Preface



Reliable computing techniques are essential if the output of a numerical
algorithm is to be guaranteed correct. Our society relies more and
more on computer systems. Usually, our systems appear to work successfully, 
but there are sometimes serious, and often minor, errors. Validated computing 
is one essential technology to achieve increased software reliability. Formal ri- 
gor in the definition of data types, the computer arithmetic, in algorithm design, 
and in program execution allows us to guarantee that the stated problem has (or 
does not have) a solution in an enclosing interval we compute. If the enclosure 
is narrow, we are certain that the result can be used. Otherwise, we have a clear 
warning that the uncertainty of input values might be large and the algorithm 
and the model have to be improved. The use of interval data types and algo- 
rithms with controlled rounding and result verification capture uncertainty in 
modeling and problem formulation, in model parameter estimation, in algorithm 
truncation, in operation round-off, and in model interpretation. 

The techniques of validated computing have proven their merits in many 
scientific and engineering applications. They are based on solid and interesting 
theoretical studies in mathematics and computer science. Contributions from 
fields including real, complex and functional analysis, semigroups, probability, 
statistics, fuzzy interval analysis, fuzzy logic, automatic differentiation, computer 
hardware, operating systems, compiler construction, programming languages, 
object-oriented modeling, parallel processing, and software engineering are all 
essential. 

This book, which contains the proceedings of the Dagstuhl Seminar 03041 
‘Numerical Software with Result Verification’ held from January 19 to 24, 2003, 
puts particular emphasis on the most recent developments in the area of validated 
computing in the important fields of software support and in applications. 

We have arranged the contributions in five parts. The first part deals with 
languages supporting interval computations. The paper by Wolff von Gudenberg 
studies different object-oriented languages with respect to their abilities and 
possibilities to efficiently support interval computations. The contribution by 
Hofschuster and Krämer gives an overview of the C-XSC project, a C++ class
library supporting intervals, the precise scalar product, standard functions with 
intervals, and various class abstractions useful for scientific computation. 

The second part is devoted to software systems and tools. In a joint pa- 
per, Kearfott, Neher, Oishi and Rico present and compare four such systems: 
GlobSol, a Fortran-based library for the verified solution of nonlinear algebraic 
systems of equations and global optimization; ACETAF, an interactive tool for 
the verified computation of Taylor coefficients; Slab, a complete Matlab-style 
high-performance interval linear algebra package; and (Fixed) CADNA, a tool 
for assessing the accuracy and stability of algorithms for embedded systems 
relying on a fixed-point arithmetic. Whereas the first three software systems 
use (machine) interval arithmetic, the last is based on the CESTAC method
and its stochastic arithmetic. Going beyond double precision in machine interval
arithmetic is the topic of the paper by Grimmer, Petras and Revol. They
describe intPackX, a Maple module which, among others, provides correctly
rounded multiprecision evaluation of standard functions, and the two C/C++-based
libraries GMP-XSC and MPFI. The authors include several examples where
multiple precision interval arithmetic is of primary importance, for example to 
show the existence of Kronrod-Patterson rules for numerical integration or in 
the numerical solution of ODEs in Asian options pricing. The last paper in this 
part is by Corliss and Yu who report on their approach and their strategy and 
experience when testing a preliminary version of an interval software package 
for its correctness. 

As software supporting interval and validated computation becomes more 
and more popular, we witness an increasing number of new modeling techni- 
ques using intervals. The third part of this volume contains five papers on these 
topics. Kieffer and Walter consider parameter and state estimation in dyna- 
mical systems involving uncertain quantities. For cooperative models, they use 
interval-based set inversion techniques to obtain tight bounds on the parameters 
and states under the given uncertainties. In an additional paper, together with 
Braems and Jaulin, they propose a new, interval computation-based technique 
as an alternative to computer algebra when testing models for identifiability. 
Auer, Kecskeméthy, Tändl and Traczinski show that interval analysis provides
new opportunities to model multibody systems and they present an advanced
software system MOBILE that includes such interval techniques. Bühler, Dyllong
and Luther discuss reliable techniques in computational geometry. They
focus on distance and intersection computations, an area where slightly wrong 
floating-point results may produce a completely wrong view of the geometry. 
The last paper by Alefeld and Mayer deals with the more fundamental issue of 
how interval arithmetic iterations behave when applied to solve linear systems 
with a singular coefficient matrix. 

Part four considers various applications of validation techniques in science 
and engineering. It starts with a contribution by Beelitz, Bischof, Lang and
Schulte Althoff on methods that guarantee the absence of singularities in cer- 
tain models for the analysis and design of chemical processes. This is of primary 
importance, since otherwise multiple steady states may result in spontaneous 
fluctuations which may even damage the chemical reactor. Fausten and Haßlinger
consider workload distributions of service systems in telecommunications
under quality-of-service aspects. They develop a method to determine workload 
distributions involving a verification step based on interval arithmetic. Three im- 
portant problems in geodesy are dealt with in the paper by Borovac and Heindl, 
who present verified methods for the direct and the inverse problem of geodetic 
surveying and the three-dimensional resection problem. Among others, enclosure 
methods for ODEs turn out to be very useful here. Schichl describes the COCONUT
project, a large, European, modular software project for constrained
global optimization. The paper explains the architecture of this software system,
which uses the FILIB++ library for its components based on interval arithmetic.
Finally, the paper by Oussena, Henni and Alt describes an application from
medical imaging in which verified computations would be of great help. 

The last part is devoted to alternative approaches to the verification of nume- 
rical computations. The contribution by Lester shows how one can use the formal 
specification checker PVS to validate standard functions like arctan and some 
exact arithmetic algorithms. Granvilliers, Kreinovich and Muller present three 
alternative or complementary approaches to interval arithmetic in cases where 
uncertainty goes beyond having bounds on input data: interval consistency tech- 
niques, techniques using probabilistic information and techniques for processing 
exact real numbers. This part closes with the paper by Putot, Goubault and 
Martel, who propose the use of static code analysis to study the propagation of 
round-off. They also present a prototype implementation of their approach. 

We would like to thank all authors for providing us with their excellent 
contributions and for their willingness to join in groups to present a coherent 
description of related research and software. We are also grateful to Springer- 
Verlag for the fruitful cooperation when preparing this volume and, last but not 
least, to the referees listed below. 



January 2004                René Alt
                            Andreas Frommer
                            R. Baker Kearfott
                            Wolfram Luther
OOP and Interval Arithmetic —
Language Support and Libraries 



Jürgen Wolff von Gudenberg

Universität Würzburg
97074 Würzburg, Germany
wolff@informatik.uni-wuerzburg.de



Abstract. After a short presentation of the paradigms of object ori- 
ented programming and interval arithmetic the languages C++ and Java 
are treated in more detail. Language features are regarded with respect 
to their support for the definition or application of interval arithmetic. 
In the final section the 4 libraries Profil/BIAS, C-XSC, filib++ as well as 
Sun Forte C++ are compared with respect to functionality and efficiency. 



1 Paradigms 

1.1 Object Oriented Programming 

An object oriented program simulates a part of the real or an imaginary world. 
Objects are constructed and communicate with each other via messages. Classes 
are defined to describe objects of the same kind. The class is the central and most 
important construct of object oriented programming languages. A class defines 
a type by giving attributes to describe a data structure and methods to specify 
the behavior of objects of that type. Using encapsulation, details of the structure
and implementation may be hidden; a class hence defines an abstract data type.
Separation of interface and implementation is a commonly used pattern, as is
hiding details of the representation or internal execution of the methods.
Objects are instances of classes in the sense of data types; they have attributes
determining their state and thus are elements of the domain. Objects control
their own state; a method call usually stimulates an object to report or change
its state. The standard data types like integers or floating-point numbers are
available as primitive types; their elements are just values, not objects.

Object oriented languages usually provide several forms of polymorphism. 
Operator or function overloading, parameterized data types or inheritance are 
the main kinds of polymorphism. Templates parameterized by a data type may 
be instantiated to create a new data type. Homogeneous lists or matrices are a 
typical example. Inheritance-based hierarchical programming, in particular, is
often used as a synonym for object oriented programming. It allows for the definition
of containers with very general element types that then also can host
specializations or derived types. Iterators are provided to pass through the con- 
tainer structure. 
Hierarchies of data types may be built where, usually, interfaces or abstract
classes are near the root and their descendants, implementations or specializations,
follow towards the leaves. In contrast to these general structures, arrays
play hardly any role. Interfaces - explicitly known in Java and implemented as
fully abstract classes in C++ - are used to define an abstract data type. An 
interface provides the signatures of methods of implementing classes. Common 
behavior for all descendants may be predefined in an abstract class by a call of 
abstract methods. 

Given abstract add and negate methods of a class Fp, e.g., the subtract 
method can be defined for all descendants as 

    Fp subtract(Fp b) { return add(b.negate()); }



1.2 Interval Arithmetic 

The main concern of interval arithmetic is to compute reliable bounds. The
arithmetic interval operations therefore use directed rounding; interval versions
of elementary functions and lattice or set operations are provided as well. Since
many algorithms in scientific computing are not only declared for scalars, interval
vectors and matrices are very important.

The most prominent applications of interval arithmetic are global optimization
[4,2] and result verification using fixed-point theorems [7,3].

Computation of the range of a function is one of the key problems in inter- 
val arithmetic. We will use it to investigate the degree of support of interval 
arithmetic by object oriented languages. There are many different algorithms to 
enclose the range. Surprisingly enough, even the most simplistic approach can 
be defined with two possible flavors of semantics, and no decision for one or the 
other seems to be convincing. 

Interval Evaluation 

$f(\mathbf{x}) = \{ f(x) \mid x \in \mathbf{x} \}$ denotes the range of values of the function
$f : D_f \subseteq \mathbb{R} \to \mathbb{R}$ over the interval $\mathbf{x} \subseteq D_f$.

An enclosure of the range can be implemented by interval evaluation of the 
formula expression for /. 

Definition 1 The interval evaluation $\mathbf{f} : \mathbb{IR} \to \mathbb{IR}$ of $f$ is defined as the
function that is obtained by replacing every occurrence of the variable $x$ by the
interval variable $\mathbf{x}$ and by replacing every operator by its interval arithmetic
counterpart and every elementary function by its range.

We call this mode the normal or interval mode. Note that arithmetic opera- 
tors and elementary functions are defined on their natural domain and produce 
an error, if the argument contains a point that is not in the domain. Hence, this 
definition only holds, if all operations are executable without exception. 
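The following C++ fragment is a minimal sketch of interval evaluation in the normal mode. The type `Iv` and the helpers `down` and `up` are hypothetical names chosen for illustration; outward rounding is approximated by stepping one floating-point number outwards after each round-to-nearest operation. It shows that two formula expressions of the same function $f(x) = x - x^2$ yield different enclosures of the range $[0, 0.25]$ over $[0, 1]$, although both are valid.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical minimal interval type (not taken from any of the libraries).
struct Iv {
    double inf, sup;
};

// Step to the next floating-point number below / above v.
inline double down(double v) { return std::nextafter(v, -std::numeric_limits<double>::infinity()); }
inline double up(double v)   { return std::nextafter(v,  std::numeric_limits<double>::infinity()); }

inline Iv operator+(Iv a, Iv b) { return {down(a.inf + b.inf), up(a.sup + b.sup)}; }
inline Iv operator-(Iv a, Iv b) { return {down(a.inf - b.sup), up(a.sup - b.inf)}; }
inline Iv operator*(Iv a, Iv b) {
    assert(a.inf <= a.sup && b.inf <= b.sup);
    // The bounds of the product are among the four endpoint products.
    double p[4] = {a.inf * b.inf, a.inf * b.sup, a.sup * b.inf, a.sup * b.sup};
    return {down(*std::min_element(p, p + 4)), up(*std::max_element(p, p + 4))};
}

// Interval evaluation of two formula expressions for f(x) = x - x^2:
// every occurrence of x is replaced by the interval argument.
inline Iv f_expanded(Iv x) { return x - x * x; }
inline Iv f_factored(Iv x) { return x * (Iv{1.0, 1.0} - x); }
```

Over $[0, 1]$ the expanded form gives roughly $[-1, 1]$ and the factored form roughly $[0, 1]$. Both contain the true range, but neither is sharp, because the variable occurs more than once in each expression.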

Containment Evaluation 

Alternatively, in the containment or extended mode a range enclosure computes
the topological closure over $\mathbb{R}^* = \mathbb{R} \cup \{-\infty\} \cup \{\infty\}$ by extending the domain
of real arithmetic operators to $\mathbb{R}^*$ and that of elementary functions to their
topological closure, see [8]. No errors are invoked, but the resulting interval may
be $\mathbb{R}^*$ or $\emptyset$. In the following definition $\wp$ denotes the power set.

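Definition 2 Let $f : D_f \subseteq \mathbb{R} \to \mathbb{R}$, then the containment set $f^* : \wp\mathbb{R}^* \to \wp\mathbb{R}^*$
defined by

$$f^*(\mathbf{x}) := \{ f(x) \mid x \in \mathbf{x} \cap D_f \} \cup \{ \lim_{D_f \ni x \to x^*} f(x) \mid x^* \in \mathbf{x} \} \subseteq \mathbb{R}^*$$

denotes the extended range of $f$.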

Definition 3 The containment evaluation $\mathbf{f}^* : \mathbb{IR}^* \to \mathbb{IR}^*$ of $f$ is defined as
the function that is obtained by replacing every occurrence of the variable $x$ by
the interval variable $\mathbf{x}$ and by replacing every operator or function by its extended
interval arithmetic counterpart.

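Theorem 1.

$$f(\mathbf{x}) \subseteq \mathbf{f}(\mathbf{x}) \qquad (1)$$

$$f(\mathbf{x}) \subseteq f^*(\mathbf{x}) \subseteq \mathbf{f}^*(\mathbf{x}) \qquad (2)$$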

The proof of (1) is well known, a similar step by step proof for (2) is carried 
out in [8]. 

Discussion 

Since arithmetic operations as well as the elementary functions are continuous 
over their domain and since this continuity is lost by the extended operations, 
only the interval mode should be used, if continuity is a presupposition as for 
example in result verification algorithms [3] using Brouwer’s fixed-point theorem. 
In the containment mode additional constraints have to be added to ensure 
continuity. 

The normal mode, however, may be too restrictive in global optimization [2]. 
Here it is correct to intersect argument interval and domain in order to obtain 
a feasible set.
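The difference between the two modes can be sketched for the square root in C++. The names `Iv`, `sqrt_normal` and `sqrt_extended` are hypothetical and are not taken from any of the libraries discussed later; the empty set is represented by a flag, which is an assumption made for this sketch only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>

// Hypothetical interval type; `empty` marks the empty set.
struct Iv {
    double inf, sup;
    bool empty;
};

const double kInfinity = std::numeric_limits<double>::infinity();
inline double down(double v) { return std::nextafter(v, -kInfinity); }
inline double up(double v)   { return std::nextafter(v,  kInfinity); }

// Normal (interval) mode: every point of x must lie in the domain [0, oo),
// otherwise an error is raised.
inline Iv sqrt_normal(Iv x) {
    assert(!x.empty && x.inf <= x.sup);
    if (x.inf < 0.0) throw std::domain_error("sqrt: argument leaves the domain");
    return {std::max(0.0, down(std::sqrt(x.inf))), up(std::sqrt(x.sup)), false};
}

// Containment (extended) mode: the argument is intersected with the domain
// first; no error is raised, but the result may be the empty set.
inline Iv sqrt_extended(Iv x) {
    if (x.empty || x.sup < 0.0) return {0.0, 0.0, true};
    double lo = std::max(x.inf, 0.0);
    return {std::max(0.0, down(std::sqrt(lo))), up(std::sqrt(x.sup)), false};
}
```

For the argument $[-1, 4]$ the normal mode raises an error, whereas the extended mode silently returns an enclosure of $[0, 2]$. A verification algorithm that relies on continuity therefore has to learn by some other means that part of the argument was cut off.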

2 Requirements and Realisations 

In this section we enumerate the requirements which are necessary, recom- 
mended, helpful, or at least nice to embed interval arithmetic in the object 
oriented languages C++ and Java. 



2.1 Requirements for Interval Arithmetic 

— A data type interval can be defined, (mandatory) 

— Vectors and matrices are available, (mandatory) 

— Floating-point arithmetic is clearly specified, (mandatory) 

— Directed rounding is provided, (recommended) 

— Intervals can be read and written, (mandatory) 
— Interval literals are accessible, (helpful)

— Operators and functions can be overloaded, (recommended) 

— Functions may be passed as parameters, (recommended) 

— Evaluation of expressions may be redefined by the user, (helpful) 

— Data types can be parameterized, (helpful) 

Every programming language of interest supports the definition of data types, 
vectors and matrices. 

Floating-point arithmetic is available in hardware. For the definition of inter- 
val arithmetic a clear specification of the performable operations, their accuracy 
and rounding mode is mandatory. 

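Even if we can assume that IEEE arithmetic is provided on every computer,
we cannot be sure that directed roundings are immediately accessible. Therefore
we consider 7 different rounding procedures. $\nabla$ denotes the function that maps
a real number to its greatest lower floating-point neighbour, $\Delta$ to the least
upper, and $\bigcirc$ to the nearest floating-point neighbour. Usually the hardware
rounding mode has to be switched explicitly. This switching may be an expensive
operation.

For the operation $[\underline{z}, \overline{z}] = [\underline{x}, \overline{x}] + [\underline{y}, \overline{y}]$ the rounding procedures are

— native: set $\nabla$; $\underline{z} = \nabla(\underline{x} + \underline{y})$; set $\Delta$; $\overline{z} = \Delta(\overline{x} + \overline{y})$

— native-switch: set $\nabla$; $\underline{z} = \nabla(\underline{x} + \underline{y})$; set $\Delta$; $\overline{z} = \Delta(\overline{x} + \overline{y})$; set $\bigcirc$

— native-onesided: set $\nabla$; $\underline{z} = \nabla(\underline{x} + \underline{y})$; $\overline{z} = -\nabla(-\overline{x} - \overline{y})$

— native-onesided-switch: set $\nabla$; $\underline{z} = \nabla(\underline{x} + \underline{y})$; $\overline{z} = -\nabla(-\overline{x} - \overline{y})$; set $\bigcirc$

— no switch: $\underline{z} = \nabla(\underline{x} + \underline{y})$; $\overline{z} = \Delta(\overline{x} + \overline{y})$

— multiplicative: $\underline{z} = (\underline{x} + \underline{y}) * \mathrm{pred}(1.0)$; $\overline{z} = (\overline{x} + \overline{y}) * \mathrm{succ}(1.0)$

— pred-succ: $\underline{z} = \mathrm{pred}(\underline{x} + \underline{y})$; $\overline{z} = \mathrm{succ}(\overline{x} + \overline{y})$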

The first 4 procedures expect that directed rounding is available in hardware 
and can be selected via a switch, the onesided roundings need only one switch. If 
the switch back to round to nearest is omitted, the semantics of the floating-point 
arithmetic, that usually works with round to nearest, is changed. 

The no-switch rounding procedure assumes that all 3 rounding modes are
immediately accessible. Multiplicative rounding may be applied, if only round 
to nearest is provided by the hardware. The predecessor and successor of a 
floating-point number may be obtained by a hardware instruction or by bit 
manipulation. 
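As an illustration, the native-onesided-switch procedure for addition might be sketched in C++ as follows. This sketch assumes that the rounding mode can be changed with `std::fesetround` from `<cfenv>` (available since C++11) and that the hardware honours the switch; the names `Iv` and `add_onesided` are hypothetical. The `volatile` qualifiers are a precaution so that the compiler does not move or pre-compute the additions outside the region in which the rounding mode is changed.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical minimal interval type.
struct Iv {
    double inf, sup;
};

// native-onesided-switch: round downwards only, obtain the upper bound by
// negation, then restore the rounding mode that was active before.
inline Iv add_onesided(Iv x, Iv y) {
    const int saved = std::fegetround();
    std::fesetround(FE_DOWNWARD);

    volatile double xl = x.inf, yl = y.inf, xu = x.sup, yu = y.sup;
    volatile double lo  = xl + yl;     // rounded down: lower bound
    volatile double neg = -xu - yu;    // rounded down: -(upper bound)

    std::fesetround(saved);            // the "switch" back, e.g. to nearest

    Iv z = {lo, -neg};                 // negation is exact
    assert(z.inf <= z.sup);
    return z;
}
```

Adding the point intervals $[1, 1]$ and $[10^{-20}, 10^{-20}]$ in double precision then yields the lower bound 1 and an upper bound one unit in the last place above 1, where round to nearest would have returned exactly 1 for both bounds.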

Input and output as well as interval literals may be realized by a conversion
between intervals and strings.

For the realisation of algorithms like the interval Newton method or range
evaluation it is strongly recommended that functions may be passed as parameters.
The definition of a particular non-standard evaluation of expressions is a further
helpful ingredient (see #-expressions in Pascal-XSC [5]).

2.2 Realisation in Java 

Java is one of the very few languages that specify the semantics of their floating- 
point arithmetic. There are even two modes to use IEEE arithmetic. In the 
strictfp mode every intermediate result occurring in an evaluation of an expression
has to be rounded to the nearest number of the corresponding primitive
data type double or float, hence the same result is obtained on any computer. 
In the default mode, however, registers with a more precise floating-point for- 
mat may be used as well as combined operations like the fused multiply and add 
operation. Exceptions for the IEEE traps overflow or division by zero, e.g., are 
never raised in any of the two modes. 

Directed roundings have to be accessed by native, i.e. non-Java, methods.
Those methods can be defined in a utility class FPU. 

    public final class FPU {
        public static final native double addDown(double x, double y);
        public static final native double mulUp(double x, double y);
        // ...
    }


Since there are no global functions in Java these utility classes are really 
necessary. The standard class Math provides the elementary functions. 

An interval class may be defined as follows 

    public class Interval {
        // Constructor
        public Interval(double x, double y) {
            inf = x < y ? x : y;
            sup = x > y ? x : y;
        }
        // ... (further constructors, e.g. the default constructor used below)

        // Access and utility methods
        public double getInf() {
            return inf;
        }

        public double diam() {
            return FPU.subUp(sup, inf);
        }
        // ...

        // Updating arithmetic methods
        public Interval sub(Interval other) {
            double tmp = other.inf;   // saved first, in case other aliases this
            inf = FPU.subDown(inf, other.sup);
            sup = FPU.subUp(sup, tmp);
            return this;
        }
        // ...

        // Arithmetic methods or functions
        public Interval fsub(Interval other) {
            Interval that = new Interval();
            that.inf = FPU.subDown(this.inf, other.sup);
            that.sup = FPU.subUp(this.sup, other.inf);
            return that;
        }
        // ...

        protected double inf, sup;
    }



Access to the protected or private attributes is only possible by explicitly
provided public methods like getInf(). Thus the construction of an illegal interval
with inf>sup is prohibited. 

Overloading of operators is not allowed in Java, but functions (methods) and 
constructors can be overloaded. 

Whereas all primitive data types like float and double follow the usual value 
semantics, Java prescribes reference semantics for all class types like Interval. 
Unexpected side effects due to aliasing may occur. The updating versions of 
operations change the state of the object for which the method was called. For 
the expression 
x=x-x/z 

x.sub(x.div(z)) overwrites the value of x before the quotient is subtracted,
whereas x.fsub(x.fdiv(z)) yields the expected result.
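The effect is not specific to Java. It can be reproduced in any language in which updating methods return a reference to the modified object. The following C++ fragment is a sketch under simplifying assumptions: the type `Iv` is hypothetical, directed rounding is omitted (so the code is not rigorous), and division is restricted to divisors with positive bounds.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical interval type mirroring sub/fsub and div/fdiv of the Java class.
struct Iv {
    double inf, sup;

    // Updating versions: modify *this and return a reference to it.
    Iv& sub(const Iv& o) {
        double tmp = o.inf;          // o may alias *this
        inf -= o.sup;
        sup -= tmp;
        return *this;
    }
    Iv& div(const Iv& o) {
        *this = fdiv(o);
        return *this;
    }

    // Functional versions: leave *this untouched and return a new value.
    Iv fsub(const Iv& o) const { return {inf - o.sup, sup - o.inf}; }
    Iv fdiv(const Iv& o) const {
        assert(o.inf > 0.0);         // sketch: positive divisors only
        double q[4] = {inf / o.inf, inf / o.sup, sup / o.inf, sup / o.sup};
        return {*std::min_element(q, q + 4), *std::max_element(q, q + 4)};
    }
};
```

With x = [2, 4] and z = [2, 2], the call `x.sub(x.div(z))` first turns x into [1, 2] and then subtracts x from itself, giving [-1, 1]. The functional form `x.fsub(x.fdiv(z))` returns [0, 3], the enclosure of x - x/z that was intended.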

General containers with corresponding iterators are available, but they can- 
not be used for primitive types without wrapping those into classes. 

Vectors and matrices can be built, but a drawback of Java for scientific 
computing is certainly the fact that matrices do not allocate contiguous memory. 

Because of these deficiencies of Java three major issues have been raised by 
the Java Grande forum 1 . 

1. The extension of floating-point arithmetic that led to the second (default) 
mode for IEEE arithmetic. 

2. The MultiArray proposal which is still under consideration. 

3. Java Generics which will come. Although explicit wrapping of primitive types 
into classes will no longer be necessary, we think it will take some time 
until efficient instantiation of the parameterized containers will have been 
achieved. 

Nevertheless progress for Java compilers like semantic inlining or light weight 
objects has been made [1] in order to increase the performance. 

Functions may be represented as objects and hence passed as parameters as 
in the following example. 

1 http://www.javagrande.org/ 
    public class IntNewton {
        public UnaryFunction f;

        public IntNewton(UnaryFunction fun) {
            f = fun;
        }

        public Interval enclZero(Interval x, double eps) {
            Interval Mid = new Interval();
            Interval fx, dfx;
            do {
                Mid.assign(x.mid());
                fx = f.evalRange(x.mid());
                dfx = f.evalDerivRange(x);
                x.intersect(Mid.sub(fx.div(dfx)));
            } while (x.diam() > eps);
            return x;
        }
    }



An object of the class IntNewton is constructed with an appropriate imple- 
mentation of the sample interface UnaryFunction. 

    public interface UnaryFunction {
        public double eval(double x);
        public double evalDeriv(double x);
        public Interval evalRange(double x);
        public Interval evalDerivRange(double x);
        public Interval evalRange(Interval x);
        public Interval evalDerivRange(Interval x);
    }

2.3 Realisation in C++ 

In C++ floating-point data types and their operations are implementation de- 
fined, the template numeric_limits<T> gives information about the properties 
like representation of infinities etc. The rounding mode has to be switched by as- 
sembler statements. This often causes problems with optimizing compilers which 
do not see the dependence of floating-point operations on those assembler state- 
ments. 
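The properties mentioned above can be queried at compile time. The following fragment is a small sketch of how a library might check its base type with `std::numeric_limits` before relying on infinities as interval bounds; the function names are hypothetical and are not part of any of the libraries discussed here.

```cpp
#include <cassert>
#include <limits>

// True if T follows IEC 559 (IEEE 754) and can represent infinities,
// which the extended mode needs for unbounded intervals.
template <typename T>
constexpr bool has_ieee_infinities() {
    return std::numeric_limits<T>::is_iec559 && std::numeric_limits<T>::has_infinity;
}

// True if the rounding style reported for T is round to nearest.
template <typename T>
constexpr bool default_rounds_to_nearest() {
    return std::numeric_limits<T>::round_style == std::round_to_nearest;
}

// Bounds of the interval representing the entire real line for base type T.
template <typename T>
T entire_inf() {
    assert(has_ieee_infinities<T>());
    return -std::numeric_limits<T>::infinity();
}

template <typename T>
T entire_sup() {
    assert(has_ieee_infinities<T>());
    return std::numeric_limits<T>::infinity();
}
```

Note that `round_style` only reports a static property of the type. It does not track a rounding mode that is changed at run time.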

Overloading of operators and the existence of global functions allow for a 
smooth implementation of interval arithmetic. Type parameters can be used in 
templates to define interval arithmetic for different base types. All operations 
needed to instantiate the templates are imported via traits templates, in general. 
These traits collections map base type specific operations to common names
used in the arithmetic class. Pre-instantiated classes for the standard base types 
double or float realize this mapping during compile time. 

    template <typename basetype> class interval
    {
        // ...
        interval<basetype> & operator -= (interval<basetype> const & o)
        {
            basetype tmp = o.INF;   // saved first, in case o aliases *this
            INF = fp_traits<basetype>::downward_minus(INF, o.SUP);
            SUP = fp_traits<basetype>::upward_minus(SUP, tmp);
            fp_traits<basetype>::reset();
            return *this;
        }
        // ...
        template <typename T>
        friend interval<T> operator - (interval<T> const & a, interval<T> const & b);
    };
    // ...
    template <typename basetype>
    interval<basetype> operator - (interval<basetype> const & a,
                                   interval<basetype> const & b)
    {
        interval<basetype> that;
        that.INF = fp_traits<basetype>::downward_minus(a.INF, b.SUP);
        that.SUP = fp_traits<basetype>::upward_minus(a.SUP, b.INF);
        fp_traits<basetype>::reset();
        return that;   // returned by value; a reference to the local would dangle
    }



As a friend the globally defined binary operator - has access to the internal 
structure of the interval data type. Parameters of class type can be passed by 
value, by reference or preferably by const reference. Hence, the expression 

x=x-x/z 

is exactly written like this. 

In C++ containers defined in the STL (Standard Template Library) are in 
general parameterized by their contents’ type. Efficient instantiation with prim- 
itive types is possible. In generic computing (using the STL) iterators combine 
containers with algorithms. Matrices are stored row-wise in a contiguous area.
The matrix template library (MTL) 2 includes a large number of data formats 
and algorithms, including most popular sparse and dense matrix formats and 
functionality equivalent to Level 3 BLAS 3 . An appropriate instantiation with 
intervals is possible but not straightforward. 



2 http://www.osl.iu.edu/research/mtl/

3 http://www.netlib.org/blas/ 
There are mainly 4 ways to pass a function as a parameter

— by a function pointer, double (*f)(double)

— as virtual function with late binding as in Java 

— via a function object that overloads the function call operator (), passed via a
template parameter

— using expression templates 

Expression templates [10] represent complete expressions ’symbolically’ by recur- 
sive templates and allow for user defined evaluation strategies via instantiation. 
Since this is done during compilation time, efficiency is not lost. We have applied 
this concept for the accurate evaluation of dot products [6].

The definition of a function object is not only elegant but also most efficient, 
since the first two methods rely on a runtime dereference. 

It is possible in C++ to overload the function call operator (). A call of this
postfix operator for an object then exactly looks like a call of a function with 
the object’s name. 
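The difference between passing a function pointer and passing a function object via a template parameter can be sketched as follows. All names (`Square`, `cube`, `apply_ptr`, `apply`) are chosen for this illustration only.

```cpp
#include <cassert>

// A function object: the call operator makes objects of this class callable.
struct Square {
    double operator()(double x) const { return x * x; }
};

// An ordinary global function.
inline double cube(double x) { return x * x * x; }

// Passing by function pointer: the call goes through a runtime dereference.
inline double apply_ptr(double (*f)(double), double x) {
    assert(f != nullptr);
    return f(x);
}

// Passing by template parameter: the concrete type of f is known at compile
// time, so the call f(x) can be resolved and inlined by the compiler.
template <class F>
double apply(F const & f, double x) {
    return f(x);
}
```

A call such as `apply(Square(), 3.0)` looks exactly like the pointer-based call `apply_ptr(cube, 2.0)`, but a separate instance of the template is generated for each function object type.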

Here is the C++ example of the interval Newton method. 

    template<class T_fun, class T_der>
    interval enclZero(interval x, double eps,
                      T_fun const & f, T_der const & df) {
        interval fx;
        do {
            fx = x.mid() - f(x.mid())/df(x);
            x.intersect(fx);
        } while (x.diam() > eps);
        return x;
    }

    // example use
    y = enclZero(interval(0.0, 10.0), 1e-8,
                 MyFunction(), MyDerivative());

    class MyFunction
    {
    public:
        interval operator () (double x) const
        {   interval X(x);
            return cos(X) + sin(X*X); }
    };

    class MyDerivative
    {
    public:
        interval operator () (interval X) const
        {   return -sin(X) + 2*X*cos(X*X); }
    };


Here we selected a very simple interface, a more sophisticated implementation 
using expression templates and automatic differentiation is possible. 



3 Interval Libraries 

All considered libraries are written in C++. We do not know of any publicly
available, widely used Java interval library.

All four libraries contain the arithmetic operators as global functions and 
the updating operators as methods. They provide a set of elementary functions, 
lattice or set operations like intersection or interval hull and a set of relational 
operations. 

Differences are in the definition of the data type and rounding mode as well 
as in some further features. 



3.1 C-XSC 

C-XSC 4 is a comprehensive library. It supports intervals of base type double, 
and complex intervals. There is a version with software floating-point arithmetic 
and pred-succ rounding procedure whereas the new version relies on hardware 
arithmetic. The normal interval evaluation of ranges is supported. 

The set of elementary functions includes exponential, logarithmic, trigono- 
metric and hyperbolic functions as well as their inverses. All functions are im- 
plemented with 1 or 2 ulp accuracy. 

C-XSC provides global operators for the set operations. 

    |        hull
    &        intersect
    <=, >=   membership tests
    <, >     means interior
    !        zero included

The only relational operations are equality and non-equality. 
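To illustrate how such set operators can be provided through overloading, here is a minimal C++ sketch. It is not the C-XSC implementation: the type `Iv` is hypothetical, only a subset of the operators is shown, and the handling of an empty intersection by an assertion is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical minimal interval type.
struct Iv {
    double inf, sup;
};

// Interval hull: the smallest interval containing both arguments.
inline Iv operator|(Iv a, Iv b) {
    return {std::min(a.inf, b.inf), std::max(a.sup, b.sup)};
}

inline bool disjoint(Iv a, Iv b) { return a.sup < b.inf || b.sup < a.inf; }

// Intersection; in this sketch the arguments must overlap.
inline Iv operator&(Iv a, Iv b) {
    assert(!disjoint(a, b));
    return {std::max(a.inf, b.inf), std::min(a.sup, b.sup)};
}

// Membership test: is the point p contained in a?
inline bool operator<=(double p, Iv a) { return a.inf <= p && p <= a.sup; }

// Interior test: does a lie in the interior of b?
inline bool operator<(Iv a, Iv b) { return b.inf < a.inf && a.sup < b.sup; }

// Zero included?
inline bool operator!(Iv a) { return a.inf <= 0.0 && 0.0 <= a.sup; }
```

For a = [0, 2] and b = [1, 3] this gives a | b = [0, 3] and a & b = [1, 2]. No rounding is needed, since the bounds are only selected and never computed.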

Input/output with proper rounding is possible with streams or strings. 
C-XSC further provides 

— vectors and matrices 

— datatype dotprecision 

— dynamic multiple precision arithmetic and standard functions 

— problem solving routines 

4 http://www.math.uni-wuppertal.de/org/WRST/xsc/cxsc.html
3.2 Profil/BIAS

Profil/BIAS 5 provides intervals for base type double in normal mode. 

There are functions delivering an enclosure of the result of an arithmetic 
operation with two floating-point numbers. The set of lattice, relational and 
elementary functions is similar to C-XSC. 

It has a sophisticated vector and matrix package and supports multiple pre- 
cision interval arithmetic. 

The current version is from 1994 and, hence, has some problems with newer 
compilers. 



3.3 Sun Forte 

The interval arithmetic library from Sun 6 features the extended mode and offers 
some compiler support. The interval class is given as a template, specializations 
for float, double, long double exist. The rounding mode is native-onesided.

There are convenient input/output features which manipulate the decimal
string representation of binary floating-point numbers. There is, of course, a
constructor with a string; input as well as output values are properly rounded,
the latter in the decimal external format.

Single-number input/output is provided: the number represents the midpoint,
and the radius of the interval is one unit in the last decimal place of the
midpoint representation. E.g. output of [2.34499, 2.34501] yields 2.34500. During
input to a program, [0.1, 0.1] = [0.1] represents the point 0.1, while using
single-number input/output, 0.1 represents [0, 0.2].

The membership tests are implemented by functions; the operators are used
for the set relational operations. Additionally, possibly and certainly relational
operations are provided. A possibly relation holds if there is at least one pair of
points, one from each interval, for which the relation holds; a certainly relation
holds for all such pairs.
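For the relation "less than" the two kinds of comparison reduce to simple tests on the bounds. The following C++ sketch uses hypothetical function names; it does not reproduce the interface of the Sun library.

```cpp
#include <cassert>

// Hypothetical minimal interval type.
struct Iv {
    double inf, sup;
};

inline bool valid(Iv a) { return a.inf <= a.sup; }

// Certainly less: x < y holds for ALL x in a and ALL y in b.
inline bool certainly_less(Iv a, Iv b) {
    assert(valid(a) && valid(b));
    return a.sup < b.inf;
}

// Possibly less: x < y holds for SOME x in a and SOME y in b.
inline bool possibly_less(Iv a, Iv b) {
    assert(valid(a) && valid(b));
    return a.inf < b.sup;
}

// Set equality: both intervals describe the same set of points.
inline bool set_equal(Iv a, Iv b) {
    assert(valid(a) && valid(b));
    return a.inf == b.inf && a.sup == b.sup;
}
```

For the overlapping intervals [1, 3] and [2, 4] the possibly relation holds but the certainly relation does not. For non-empty intervals every certainly relation implies the corresponding possibly relation.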

3.4 filib++

filib++ 7 is the newest of the libraries, the interface is similar to the Sun li- 
brary whereas the implementation of the elementary functions is an accelerated, 
slightly less accurate, but rigorous version based on the C-XSC functions. 

The interval class is given as a template with 3 parameters, the base type, 
the rounding mode and the evaluation mode. 

Operators for the different base types are imported via traits templates. 
Specializations for float, double exist. 

The rounding mode may be set to all 7 procedures, listed in section 2.1. 
There are three possible choices for the evaluation mode parameter. The 
default is the normal mode, the extended mode can be chosen or a combination 

5 http://www.ti3.tu-harburg.de/Software/PROFILEnglisch.html 

6 http://docs.sun.com/source/816-2465/index.html 

7 http://www.math.uni-wuppertal.de/org/WRST/software/filib.html
of both modes that computes in the extended mode but sets a flag, whenever
the normal mode would have raised an exception. This continuity flag informs 
the user, whether a continuity condition has been violated. 
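The three evaluation modes can be illustrated with a square root on intervals. The sketch below is written under explicit assumptions: the names EvalMode, SqrtResult and interval_sqrt are hypothetical, the extended mode is assumed to evaluate the function on the part of the argument that lies inside the domain, and the bounds are not rounded outward. It is not the filib++ implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <stdexcept>

// Hypothetical names for this sketch; not the filib++ API.
enum class EvalMode { Normal, Extended, ExtendedWithFlag };

struct Interval {
    double lo;
    double hi;
};

struct SqrtResult {
    Interval value;
    bool continuity_violated;  // only ever set in ExtendedWithFlag mode
};

SqrtResult interval_sqrt(const Interval& x, EvalMode mode) {
    const bool leaves_domain = x.lo < 0.0;
    if (leaves_domain && mode == EvalMode::Normal) {
        // Normal mode: an argument outside the domain is an error.
        throw std::domain_error("sqrt: interval contains negative numbers");
    }
    SqrtResult result;
    result.continuity_violated =
        leaves_domain && mode == EvalMode::ExtendedWithFlag;
    if (x.hi < 0.0) {
        // Assumption: an argument entirely outside the domain yields an
        // empty result, represented here by NaN bounds.
        const double nan = std::numeric_limits<double>::quiet_NaN();
        result.value.lo = nan;
        result.value.hi = nan;
        return result;
    }
    // Extended modes: evaluate on the intersection with the domain [0, inf).
    result.value.lo = std::sqrt(std::max(x.lo, 0.0));
    result.value.hi = std::sqrt(x.hi);
    return result;
}
```

With the argument [-1, 4], the normal mode throws, the extended mode returns [0, 2], and the combined mode returns [0, 2] together with a set flag.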

Input and output facilities are somewhat restricted in filib++. The string
constructor relies on the fact that the decimal-to-binary conversion is accurate.
(The shipped default conversion routine has this property.) Output prints the
two bounds using the default binary-to-decimal conversion. Additionally, the bit
image of the bounds can be output.



3.5 Timings 

In this last section we present some performance tests of the arithmetic
operators in each library and with different rounding procedures. Note that these
results heavily depend on the underlying hardware, operating system, and compiler.
Individual checks should be done to determine the most efficient version.

We tested the arithmetic operations in a loop; the numbers (double) were
randomly generated into vectors of different lengths. The processor was a 2 GHz
Pentium 4 running under Linux. For filib++ we used the gcc 3.2.1 compiler with
optimization level O3; for Profil/BIAS and C-XSC we had to choose gcc 2.95.3 with
optimization level O3 or O1, respectively.
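A measurement of this kind can be sketched as follows. This is not the benchmark program used for the tables: the Interval type, the function measure_add_miops and the fixed random seed are assumptions made for the illustration, and the addition is done without directed rounding. An optimizing compiler may also simplify the repeated identical passes, so serious measurements need more care.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical interval type; addition without directed rounding (timing only).
struct Interval {
    double lo;
    double hi;
};

inline Interval add(const Interval& a, const Interval& b) {
    Interval r = {a.lo + b.lo, a.hi + b.hi};
    return r;
}

// Returns the measured rate in MIOPs (million interval operations per second)
// for element-wise addition of two random interval vectors of length n.
double measure_add_miops(std::size_t n, int repetitions) {
    std::mt19937 gen(42);
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    std::vector<Interval> x(n), y(n), z(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double a = dist(gen), b = dist(gen);
        const double c = dist(gen), d = dist(gen);
        x[i].lo = std::min(a, b);  x[i].hi = std::max(a, b);
        y[i].lo = std::min(c, d);  y[i].hi = std::max(c, d);
    }

    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (int r = 0; r < repetitions; ++r)
        for (std::size_t i = 0; i < n; ++i)
            z[i] = add(x[i], y[i]);
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();

    // Read the results so that the loop cannot be discarded entirely.
    double checksum = 0.0;
    for (std::size_t i = 0; i < n; ++i) checksum += z[i].lo + z[i].hi;
    volatile double sink = checksum;
    (void)sink;

    const double seconds = std::chrono::duration<double>(stop - start).count();
    const double operations = static_cast<double>(n) * repetitions;
    return operations / std::max(seconds, 1e-9) / 1e6;
}
```

The returned figure depends on hardware, compiler and flags, which is why no expected value is given here.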

A newer version of C-XSC that exploits the hardware arithmetic is in preparation.
Its performance is expected to improve by a factor of approximately 10.

Comparison of Libraries 

The figures in the following tables denote MIOPs (million interval operations 
per second). 



Library            +      -      *      /
filib++ traits    22.4   22.2   11.4    8.9
filib++ macro     17.7   17.6   10.9    8.0
profil            11.6   11.3    7.6    9.8
cxsc-1             1.8    1.5    1.3    0.7


The fastest traits version of filib++ was tested against an older version using no 
templates but macros, the Profil/BIAS library and the old version of C-XSC. 
The table shows that the new compiler technology makes the macro version 
obsolete. 

Timings Rounding Mode 

The dependence on the rounding mode is tested in the next table, where all
rounding procedures in filib++ were compared. Note that in this case no-switch
means no rounding, since the processor needs a switch to change the rounding
mode. This mode does not deliver reliable bounds; it is only tested for comparison.

Rounding mode              +      -      *      /
native                    22.4   22.2   11.4    8.8
native-switch              3.9    3.9    3.5    3.0
native-onesided           20.9   21.2   13.9    8.2
native-onesided-switch    19.2   19.3    8.9    6.3
no-switch                 24.7   24.6   16.4    9.2
multiplicative             8.8    8.9    6.1    6.2
pred-succ                  7.5    7.8    1.5    1.7


We think that the bad performance of native-switch is caused by the archi- 
tecture of the processor that can handle two but not three switches effectively. 
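To make the cost of mode switches more tangible, the following sketch contrasts two ways of computing a rigorously rounded interval sum with the standard header <cfenv>. It is an illustration only and does not reproduce the filib++ code: the names Interval, add_switching and add_onesided are hypothetical, and the assumption that a one-sided procedure obtains the lower bound through the identity down(a + b) = -up((-a) + (-b)) is ours. The volatile temporaries are there to keep the compiler from moving or folding the additions across the calls that change the rounding mode.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical interval type for this sketch.
struct Interval {
    double lo;
    double hi;
};

// Variant 1: switch the rounding mode separately for each bound.
Interval add_switching(const Interval& a, const Interval& b) {
    assert(a.lo <= a.hi && b.lo <= b.hi);
    volatile double alo = a.lo, blo = b.lo, ahi = a.hi, bhi = b.hi;
    const int saved = std::fegetround();
    std::fesetround(FE_DOWNWARD);
    volatile double lo = alo + blo;        // lower bound, rounded downward
    std::fesetround(FE_UPWARD);
    volatile double hi = ahi + bhi;        // upper bound, rounded upward
    std::fesetround(saved);
    Interval r = {lo, hi};
    return r;
}

// Variant 2: one-sided. Only upward rounding is used; the lower bound is
// obtained by negating the upward-rounded sum of the negated operands.
Interval add_onesided(const Interval& a, const Interval& b) {
    assert(a.lo <= a.hi && b.lo <= b.hi);
    volatile double nalo = -a.lo, nblo = -b.lo, ahi = a.hi, bhi = b.hi;
    const int saved = std::fegetround();
    std::fesetround(FE_UPWARD);
    volatile double neg_lo = nalo + nblo;  // equals -(a.lo + b.lo rounded down)
    volatile double hi = ahi + bhi;
    std::fesetround(saved);
    Interval r = {-neg_lo, hi};
    return r;
}
```

Both variants return the same enclosure; the second needs only one change of the rounding direction per operation, or none at all if upward rounding is kept permanently.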



Timings Extended Mode 

The next table displays the results for the extended mode. 



Rounding mode              +      -      *      /
native                    18.7   18.9    4.5    8.5
native-switch              3.6    3.6    2.5    2.8
native-onesided           11.9   11.9    7.9    6.3
native-onesided-switch    10.5   10.6    4.5    5.0
no-switch                 22.0   22.1   10.6    9.1
multiplicative             8.5    8.5    4.6    5.6
pred-succ                  6.8    7.0    0.5    0.9


Comparison with Sun Forte 

Finally we compared filib++ with the Sun Forte library. These benchmarks were
performed on a Sun Ultra 60 with 2 processors running at 360 MHz. filib++
with rounding native-onesided-switch in extended mode was tested against Sun’s
interval arithmetic. The filib++ benchmark was compiled by the gcc 3.2 compiler
with optimization level O3; the same program using Sun’s intervals was compiled by
Sun’s CC compiler in default mode and with optimization level O5, respectively.



Library            +      -      *      /
filib++ traits     3.2    3.2    1.9    2.0
Sun                1.3    1.3    1.0    1.1
Sun (O5)           2.8    2.7    2.1    1.8


It turns out that the filib++/GNU combination outperforms Sun in every case except
multiplication at optimization level O5.

4 Conclusion 

Object orientation and interval arithmetic are complementary paradigms which
fit together well. In our opinion the support of interval arithmetic in C++ is
superior to that in Java. This is also evident from the fact that several C++
libraries are available and commonly used. Comparing the libraries shows that there
are not many differences between them, but some of them have really grown old and
would benefit from an updated release.

Abstract. In this note the main features and newer developments of
the C++ class library for extended scientific computing C-XSC 2.0 will 
be discussed. 

The original version of the C-XSC library is about ten years old. In the
last decade, however, the underlying programming language C++ has
developed significantly. The C++ standard has been available since November
1998, and more and more compilers support (most of) the features of this
standard. The new version C-XSC 2.0 conforms to this standard. Application
programs written for older C-XSC versions have to be modified
to run with C-XSC 2.0. Several examples will help the user to see which
changes have to be made. Note that all sample codes given in [6] have
to be modified to work properly with C-XSC 2.0.

All sample codes listed in this note will be made available on the web
page http://www.math.uni-wuppertal.de/~xsc/cxsc/examples.



1 Introduction 

For those who are not so familiar with C-XSC let us first motivate the library by
quoting essential parts (with slight modifications) from the preface of the book
[6]:

The programming environment C-XSC (C++ for extended Scientific Com- 
puting) is a powerful and easy to use programming tool, especially for scientific 
and engineering applications. C-XSC is particularly suited for the development 
of numerical algorithms that deliver highly accurate and automatically veri- 
fied results. It provides a large number of predefined numerical data types and 
operators of maximum accuracy. The most important features of C-XSC are 
real, complex, interval, and complex interval arithmetic with mathematically 
defined properties; dynamic vectors and matrices; dotprecision data types (ac- 
curate dot products); predefined arithmetic operators with highest accuracy; 
standard functions of high accuracy; dynamic multiple-precision arithmetic and 
rounding control for the input and output of data. 



R. Alt et al. (Eds.): Num. Software with Result Verification, LNCS 2991, pp. 15-35, 2004.
© Springer-Verlag Berlin Heidelberg 2004

Accumulation of numbers is the most sensitive operation in floating-point
arithmetic. By that operation scalar products of floating-point vectors, matrix 
products etc. can be computed without any error in infinite precision arithmetic, 
making an error analysis for those operations superfluous. Many algorithms ap- 
plying that operation systematically have been developed. For others the limits 
of applicability are extended by using this additional operation. Furthermore, 
the optimal dot product speeds up the convergence of iterative methods (cited 
from [10,11]). C-XSC provides accurate dot products via software simulation
(hardware support should increase the computation speed by 2 orders of
magnitude; again, see [11]). Computing x*y for floating-point vectors x and y in
C-XSC yields the best possible floating-point result (the exact mathematical
result rounded to the nearest floating-point number). Using the new C-XSC data
type dotprecision, the user can even store the result of dot products of
floating-point vectors with millions of components without any error. The
so-called staggered format allows multiple-precision computations. The realization
of arithmetic operations for variables of this data type makes extensive use of
the accurate dot product. With appropriate hardware support for dot product
operations the staggered arithmetic would be very fast.

C-XSC consists of a run time system written in ANSI C and C++ including 
an optimal dot product and many predefined data types for elements of the most 
commonly used vector spaces such as real and complex numbers, vectors, and 
matrices. Operators for elements of these types are predefined and can be called 
by their usual operator symbols. Thus, arithmetic expressions and numerical al- 
gorithms are expressed in a notation that is very close to the usual mathematical 
notation. 

Additionally, many problem-solving routines with automatic result verifica- 
tion (e.g. C++ Toolbox for Verified Computing with one- and multi-dimensional 
solvers for systems of linear equations, linear optimization, automatic differenti- 
ation, nonlinear systems of equations, global optimization and further packages 
like slope and taylor arithmetic or quadrature and cubature of singular integrals) 
have been developed in C-XSC for several standard problems of numerical anal- 
ysis. All software is freely available. 



2 Overview on the New Version C-XSC 2.0 

Due to the following changes, older C-XSC programs have to be modified
slightly to run with C-XSC 2.0 (for details please refer to Section 4):

— All C-XSC routines are now in the namespace cxsc. So you have to fully
qualify names of C-XSC routines (e.g. cxsc::sin(cxsc::interval(3.0))) or you
have to include the line using namespace cxsc; in your source code.

— Typecast constructors are now available

— Constant values formerly passed by reference are now passed by const
references

— Modifications in the field of subvectors and submatrices have been made

— The error handling is now done using the C++ exception handling
mechanism (using try, catch, and appropriate exception classes)

— The new version of the library uses templates extensively 

The source code of C-XSC 2.0 is freely available from
http://www.math.uni-wuppertal.de/~xsc/xsc/download.html, and the
source code of a new version of the C++ Toolbox for Verified Computing [1],
which works with C-XSC 2.0, is also freely available from the same web site.

3 Freely Available Software Based on C-XSC 2.0 

Here we list (additional) software based on C-XSC 2.0 which is freely available 
from our web-site: 

a) (Modified) Toolbox for Verified Computing (see [1]). This toolbox com- 
prises a couple of verification algorithms for one- and multi-dimensional numer- 
ical problems: 

a1) The available one-dimensional problem solving routines are:

— Accurate polynomial evaluation 

— Automatic differentiation 

— Nonlinear equations in one variable 

— Selfverifying global optimization 

— Accurate arithmetical expressions 

— Zeros of complex polynomials 

a2) The available multi-dimensional problem solving routines are: 

— Systems of linear equations 

— Linear optimization 

— Automatic differentiation (gradient, Jacobi-, Hesse matrix) 

— Nonlinear systems of equations 

— Global optimization 

b) Further available software packages are: 

— Interval slope arithmetic (Breuer) 

— Interval Taylor arithmetic (Breuer) 

— Mathematical functions for complex rectangular intervals (Westphal)

— Verified quadrature and cubature of nonsingular and singular integrals
(Wedner, see [8,20])

— Verified estimates for Taylor coefficients of analytic functions (Neher [16])

— Routines to compute rigorous worst case a priori bounds for absolute and/or 
relative errors of floating point algorithms (Bantle [7]) 

— Solvers for under- and overdetermined systems of linear equations 
(Holbig [3]) 

— Verified solutions of ordinary differential equations (Lohner [13]) 

You can download the source code of all software packages from
http://www.math.uni-wuppertal.de/~xsc.

There you will also find more specific information on the packages as well as some
preprints.

4 Which Modifications in Source Codes Are Required?



In this section we try to answer the most frequently asked questions of C-XSC 
users concerning the migration of older C-XSC application programs to the new 
C-XSC 2.0 version. For those who are familiar with the C++ standard [5] the 
source code modifications should be rather obvious (see e.g. Stroustrup [19], 
Meyers [14,15]). 

To make available the advanced input and output facilities (stream concept)
of C++ you must include the header file iostream using the source line
#include <iostream>. Note that the name of the header is not iostream.h. In
general, the names of system header files coming with C++ do not have an
extension.

To perform conversions of interval constants given as strings, C-XSC uses the
header file string (#include <string>). This header introduces (dynamic) C++
strings with predefined operators.

C-XSC delivers several header files. The extension of these files is .hpp. The
header files correspond to the additional numerical data types available in
C-XSC (like interval, imatrix, cmatrix, ...). The names of the header files are



cdot.hpp        dot.hpp         l_complex.hpp   lrvector.hpp
cidot.hpp       idot.hpp        l_imath.hpp     real.hpp
cimatrix.hpp    imath.hpp       l_interv.hpp    rmath.hpp
cinterval.hpp   imatrix.hpp     l_real.hpp      rmatrix.hpp
civector.hpp    interval.hpp    l_rmath.hpp     rvector.hpp
cmatrix.hpp     intmatrix.hpp   limatrix.hpp
complex.hpp     intvector.hpp   livector.hpp
cvector.hpp     ivector.hpp     lrmatrix.hpp



A leading l in the name of a header file indicates a long precision (staggered)
data type; dot indicates dotprecision data types able to store dot products
without errors (long accumulators). In contrast to system header files, which are
included in the form #include <header>, C-XSC header files are included using
double quotes: #include "cxscheader.hpp".

The result type of the routine main() should be int.

Newer C++ compilers implement the namespace concept more strictly. The
standard namespace of C++ is called std. All C-XSC routines are defined in the
namespace cxsc. If you don’t want to fully qualify the names of such routines
(e.g. std::cout or cxsc::interval) you should include the two source lines



using namespace std; //make available names like cout, endl, ... 
using namespace cxsc; //make available names of C-XSC routines 



in your application code. 

The following simple example program demonstrates most of the points from 
above. It checks whether the number 0.1 is representable as a point interval in C- 
XSC. If this is not the case, the decimal number 0.1 is not exactly representable 
as a double number.

#include <iostream>       //C++ stream concept for input and output
#include <string>         //C++ strings
#include "interval.hpp"   //C-XSC header file for data type interval

using namespace std;      //make available names like cout, endl, ...
using namespace cxsc;     //make available names of C-XSC routines

int main()
{
  interval x;                  //x is an interval variable
  string("[0.1,0.1]") >> x;    //convert the interval constant to its
                               //internal binary representation
                               //(using directed roundings)
  if (Inf(x) != Sup(x))
    cout << "Number x has no exact binary representation!";
  else
    cout << "Number x has an exact binary representation!";

  cout << endl << "x = " << x << endl;   //decimal output using C++ streams
  cout << Hex << "x = " << x << endl;    //hexadecimal output
  return 0;
}

/* Output
Number x has no exact binary representation!
x = [ 0.099999, 0.100001]
x = [+19999999999999e3FB, +1999999999999Ae3FB]
*/

If your (older) application code contains calls to conversion functions like
_interval(...) you should now use constructor calls like interval(...) instead.
The C-XSC conversion functions (starting with an underscore) are obsolete.

Several function signatures of C-XSC routines have been changed from
reference parameters (T& x) to const reference parameters (const T& x). The
following C++ sample program demonstrates some consequences.

#include <iostream>
using namespace std;

void f(const double& x) { cout << "Formal argument with const" << endl; }
void f(double& x)       { cout << "No const qualifier" << endl; }

int main()
{
  double x=2;
  f(1.0);     //1, actual argument is not an lvalue
  f(x);       //2, x is an lvalue
  f(1.0+x);   //3, actual argument is not an lvalue
  f(x+x);     //4, actual argument is not an lvalue
  return 0;
}

/* Output
Formal argument with const
No const qualifier
Formal argument with const
Formal argument with const
*/



Note, due to the const qualifier the signatures in the two definitions of f() 
are different in C++! If we remove the first definition of f(), the function calls in 
the lines indicated by 1, 3, and 4 produce errors during the compilation process. 
In these cases the actual arguments are not lvalues whereas the formal argument 
of type double& (see the second definition of f) requires an lvalue. 

Note that the two definitions

void g(const double x) {cout << "Formal argument with const" << endl;}
void g(double x)       {cout << "No const qualifier" << endl;}

are not allowed simultaneously in a C++ program unit. Here, the formal
arguments are not declared as references. This implies that in both cases the
actual argument in a function call is passed by value (the values of the actual
arguments cannot be changed in the body of the function). So the additional
const qualifier does not change the signature, and the second definition is a
redefinition of the first.

Operators like [] as member functions of a class may be overloaded differently
for objects and const objects. This is demonstrated by the following C++
sample code (the const between the parameter list and the body of the operator
definition indicates that in the body of the function the attributes of the
left-hand side object in a corresponding operator call are not modifiable):

#include <iostream>
using namespace std;

typedef double T;

struct vector {
  vector(int k)                    //constructor
  {
    start = new T[k];
    for (int i=0; i<k; i++) start[i]= i;
  }

  //operator[] may be applied to vectors
  //elements are readable and writable (result type is T&)
  T& operator[](int k)
  {
    cout << "[] without const ... " << endl;
    return start[k];
  }

  //operator[] may be applied to const vectors
  //elements are only readable (result type is const T&)
  const T& operator[](int k) const
  {
    cout << "[] with const ... " << endl;
    return start[k];
  }

  ~vector() { delete[] start; }    //destructor
private:
  T* start;
};

int main() {
  vector x(3);
  cout << "x[2]: " << x[2] << endl;
  x[2]= 5;   //Note, calling operator[] creates output (see below)
  cout << "x[2]: " << x[2] << endl;

  const vector y(3);   //the same as vector const y(3);
  cout << "y[2]: " << y[2] << endl;
  // y[2]= 5;   //would lead to a compile time error:
                //The left operand cannot be assigned to
  return 0;
}

/* Output:
x[2]: [] without const ...
2
[] without const ...
x[2]: [] without const ...
5
y[2]: [] with const ...
2
*/



In contrast to the older C-XSC versions, C-XSC 2.0 uses additional helper
classes intvector_slice, rvector_slice, ivector_slice, cvector_slice,
civector_slice, l_rvector_slice, l_ivector_slice, intmatrix_slice,
intmatrix_subv, rmatrix_slice, rmatrix_subv, imatrix_slice,
imatrix_subv, cmatrix_slice, cmatrix_subv, cimatrix_slice,
cimatrix_subv, l_rmatrix_slice, l_rmatrix_subv, l_imatrix_slice,
l_imatrix_subv to implement subvectors and subarrays.

The following program shows how the first row and the first column of a real
matrix may be modified by calling a function named testfct. The formal parameter
of this function must be of data type rmatrix_subv.

#include <iostream>

#include "rmatrix.hpp"   //C-XSC header for real matrices
                         //header for real vectors is included automatically

using namespace std;
using namespace cxsc;

void testfct(const rmatrix_subv& y)   //pay attention to the data type of y
//void testfct(const rvector& y)   an error message or a warning would be
//                                 generated by current compilers
{
  for (int i=Lb(y); i<=Ub(y); i++) y[i]= i;
}

int main(void)
{
  rmatrix M;              //M is a real matrix
  int dim;
  cout << "Dimension = "; cin >> dim;
  Resize(M,dim,dim);      //create M with dim rows and dim columns
  M= 1;                   //set all elements of M to 1

  cout << "Matrix M:" << endl << M << endl;
  testfct(M[1]);          //M[1] means the first row of M
  cout << "Matrix M:" << endl << M << endl;
  testfct(M[Col(1)]);     //M[Col(1)] means the first column of M
  cout << "Matrix M:" << endl << M << endl;
  M[Col(1)]= 9;           //set all elements of column 1 to 9
  cout << "Matrix M:" << endl << M << endl;
  return 0;
}

/* Output

Dimension = 3
Matrix M:
1.000000 1.000000 1.000000
1.000000 1.000000 1.000000
1.000000 1.000000 1.000000

Matrix M:
1.000000 2.000000 3.000000
1.000000 1.000000 1.000000
1.000000 1.000000 1.000000

Matrix M:
1.000000 2.000000 3.000000
2.000000 1.000000 1.000000
3.000000 1.000000 1.000000

Matrix M:
9.000000 2.000000 3.000000
9.000000 1.000000 1.000000
9.000000 1.000000 1.000000

*/



5 Examples 

In this section we give a couple of complete sample codes to demonstrate the 
usage and several features of C-XSC 2.0. 

5.1 Example: Accurate Summation of Floating-Point Numbers 

Let us start with a very simple demonstration of how the accurate dot product
feature may be used to get accurate results when summing up floating-point
numbers of very different orders of magnitude. The C-XSC routine
accumulate(a,x,y) computes a+x*y without any error. Here x and y are
floating-point numbers and a is a variable of type dotprecision (a so-called
long accumulator):

//Severe cancellation when computing the sum of three numbers
//Using a dotprecision variable gives the correct result

#include <iostream>   //C++ input and output
#include "dot.hpp"    //make available C-XSC's accurate dot product feature
using namespace std;
using namespace cxsc; //make available C-XSC names without cxsc::

int main() {
  const real large(1.23e35);    //create a large number
  dotprecision a(0);            //a is a dotprecision variable initialized by 0
  accumulate(a, 1.0, large);    //a = 1.0*large = 1.23e35
  cout << a << endl;
  accumulate(a, 1.0, 1.5);      //a = 1.0*large + 1.0*1.5
                                //  = 1.2300...015e35
  accumulate(a, -1.0, large);   //a = large + 1.5 - large = 1.5
  cout << "Final correct result is " << a << endl;
  cout << "Naive floating point evaluation gives" << endl
       << "the totally wrong result "
       << large + 1.5 - large << endl;
  return 0;
}

/* Output:
1.2300000000E+0035
Final correct result is 1.5000000000
Naive floating point evaluation gives
the totally wrong result 0.000000
*/

The possibility to compute dot products of floating point vectors accurately 
is the key for the implementation of matrix/vector operations of maximum 
accuracy in C-XSC. This feature is also used extensively in defect correction 
steps of iterative schemes. The operations for the staggered data type (multiple- 
precision) available in C-XSC [9] are heavily based on accurate dot product 
computations. 
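For readers without C-XSC at hand, the effect can be reproduced in standard C++ with a different technique. The sketch below does not implement a long accumulator; it uses compensated summation based on Knuth's TwoSum error-free transformation, which recovers the rounding error of each addition and adds the collected errors at the end. The function names naive_sum and compensated_sum are chosen for this illustration. Unlike a dotprecision variable, this approach does not guarantee an exact result for arbitrary data, and it requires that the compiler does not reorder floating-point operations (no fast-math options).

```cpp
#include <cstddef>
#include <vector>

// Ordinary left-to-right summation in double precision.
double naive_sum(const std::vector<double>& values) {
    double sum = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) sum += values[i];
    return sum;
}

// Compensated summation: each addition is split by TwoSum into the rounded
// sum s and its exact rounding error, and the errors are accumulated.
double compensated_sum(const std::vector<double>& values) {
    double sum = 0.0;
    double error = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double x = values[i];
        const double s = sum + x;                 // rounded sum
        const double bb = s - sum;                // part of x that arrived in s
        error += (sum - (s - bb)) + (x - bb);     // rounding error of sum + x
        sum = s;
    }
    return sum + error;
}
```

For the three summands 1.23e35, 1.5 and -1.23e35 used above, naive_sum returns 0 while compensated_sum returns 1.5.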

5.2 Example: Accurate Evaluation of Arithmetical Expressions 

The following arithmetical expression has been used by Loh and Walster [12] as
an example in which numerical evaluations using IEEE 754 arithmetic gave a
misleading result, even though use of increasing arithmetic precision suggested
reliable computation (the expression is a rearrangement of Rump’s original
example given in [17]). Evaluating

$$f(a,b) = (333.75 - a^2)\,b^6 + a^2\,(11a^2b^2 - 121b^4 - 2) + 5.5\,b^8 + \frac{a}{2b} \qquad (1)$$

for a = 77617 and b = 33096 using 32-bit, 64-bit, and 128-bit round-to-nearest
IEEE-754 arithmetic produces:

32-bit:  f = 1.172604
64-bit:  f = 1.1726039400531786
128-bit: f = 1.1726039400531786318588349045201838

However, the correct result is -0.8273960...
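The correct value can be checked independently with exact integer arithmetic. The following sketch is not part of C-XSC and makes two assumptions: it relies on the 128-bit integer type __int128, which is a GCC/Clang extension rather than standard C++, and it is only valid for arguments of about this magnitude, since larger values would overflow. Multiplying the polynomial part of (1) by 4 turns the coefficients 333.75 and 5.5 into the integers 1335 and 22, so that all terms are integers for integer a and b.

```cpp
#include <cmath>

// Exact value of 4*P(a,b), where P is expression (1) without the term a/(2b).
// Assumption: __int128 is available (GCC/Clang extension) and the arguments
// are small enough that no intermediate result overflows 128 bits.
long long four_times_polynomial_part(long long a_in, long long b_in) {
    typedef __int128 wide;
    const wide a2 = wide(a_in) * a_in;
    const wide b2 = wide(b_in) * b_in;
    const wide b4 = b2 * b2;
    const wide b6 = b4 * b2;
    const wide b8 = b4 * b4;
    const wide result = (1335 - 4 * a2) * b6
                      + a2 * (44 * a2 * b2 - 484 * b4 - 8)
                      + 22 * b8;
    return static_cast<long long>(result);
}

// Value of (1): exact polynomial part plus the quotient a/(2b) in double.
double rump_value(long long a, long long b) {
    const double polynomial = four_times_polynomial_part(a, b) / 4.0;
    return polynomial + static_cast<double>(a) / (2.0 * static_cast<double>(b));
}
```

For a = 77617 and b = 33096 the polynomial part is exactly -2 (these arguments satisfy a^2 = 5.5 b^2 + 1), so f(a,b) = -2 + a/(2b), which is -0.8273960... The floating-point results listed above are what remains when the polynomial part is computed as 0.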

To compute a sharp enclosure of f(a, b) we use the staggered interval
arithmetic available in C-XSC.

#include <iostream>
#include "l_interval.hpp"   //staggered intervals (multi-precision intervals)

using namespace cxsc;       //make available routines from namespace cxsc
using namespace std;

l_interval f(const l_interval& a, const l_interval& b)
{
  l_interval z;             //multi-precision interval
  z = (333.75 - power(a,2))*power(b,6) + power(a,2)*(11.0*power(a,2)
      *power(b,2) - 121.0*power(b,4) - 2.0) + 5.5*power(b,8) + a/(2.0*b);
  return(z);
}

int main()
{
  l_real     a, b;          //multi-precision reals
  l_interval res;           //multi-precision interval
  real       Eps;

  cout << "Enter the arguments:" << endl;
  cout << " a = "; cin >> a;   //read a multi-precision real
  cout << " b = "; cin >> b;
  cout << endl;
  cout << "Desired accuracy: Eps = "; cin >> Eps;
  cout << endl;
  cout << "Evaluation of (333.75-a^2)b^6+a^2(11a^2b^2-121b^4-2)"
          "+5.5b^8+a/(2b)"
       << endl << endl;

  stagprec=0;
  do {
    stagprec++;
    res = f(l_interval(a), l_interval(b));
    //Output format via dotprecision
    cout << SetDotPrecision(16*stagprec, 16*stagprec-3);
    cout << "Interval enclosure: " << res << endl;
    cout << SetDotPrecision(5,2);
    cout << "Diameter: " << diam(res) << endl;
  } while (diam(res)>Eps);
  return 0;
}

/* Output

Enter the arguments:
 a = 77617
 b = 33096

Desired accuracy: Eps = 1e-100

Evaluation of (333.75-a^2)b^6+a^2(11a^2b^2-121b^4-2)+5.5b^8+a/(2b)

Interval enclosure: [-3.5417748621523E+0021,
                      3.5417748621523E+0021]
Diameter: 7.08E+0021

Interval enclosure: [-6.55348273960599472047761082650E+0004,
                      1.17260394005317869492444060598]
Diameter: 6.55E+0004

Interval enclosure: [-0.827396059946821368141165095479816291999033116,
                     -0.827396059946821368141165095479816291999033115]
Diameter: 2.74E-0048

Interval enclosure: [-0.827396059946821368141165095479816291999033115
                      7843848199178149,
                     -0.827396059946821368141165095479816291999033115
                      7843848199178148]
Diameter: 1.52E-0064

Interval enclosure: [-0.827396059946821368141165095479816291999033115
                      78438481991781484167270969301427,
                     -0.827396059946821368141165095479816291999033115
                      78438481991781484167270969301426]
Diameter: 1.69E-0080

Interval enclosure: [-0.827396059946821368141165095479816291999033115
                      784384819917814841672709693014261542180323906
                      213,
                     -0.827396059946821368141165095479816291999033115
                      784384819917814841672709693014261542180323906
                      212]
Diameter: 1.87E-0096

Interval enclosure: [-0.827396059946821368141165095479816291999033115
                      784384819917814841672709693014261542180323906
                      2122310853275320281,
                     -0.827396059946821368141165095479816291999033115
                      784384819917814841672709693014261542180323906
                      2122310853275320280]
Diameter: 2.08E-0112
*/



The last enclosure is accurate to more than 110 digits (that is, to all digits
printed).

Let us now solve the same problem (1) (example from Rump/Loh & Walster)
with the toolbox algorithm for the accurate evaluation of arithmetical
expressions:

#include <iostream>
#include <expreval.hpp>   //Expression evaluation

using namespace cxsc;
using namespace std;

Staggered f(StaggArray& v)
{
  Staggered a, b;

  a = v[1];
  b = v[2];
  return ( (333.75 - Power(a,2))*Power(b,6) + Power(a,2)*(11.0*Power(a,2)
           *Power(b,2) - 121.0*Power(b,4) - 2.0) + 5.5*Power(b,8) + a/(2.0*b) );
}

int main()
{
  real     Eps, Approx;
  int      StaggPrec, Err;
  rvector  Arg(2);
  interval Encl;

  cout << SetPrecision(23,15) << Scientific;   //Output format
  cout << "Evaluation of (333.75-a^2)b^6+a^2(11a^2b^2-121b^4-2)"
          "+5.5b^8+a/(2b)"
       << endl << endl;
  cout << "Enter the arguments:" << endl;
  cout << " a = "; cin >> Arg[1];
  cout << " b = "; cin >> Arg[2];
  cout << endl;
  cout << "Desired accuracy: Eps = "; cin >> Eps;
  cout << endl;

  Eval(f, Arg, Eps, Approx, Encl, StaggPrec, Err);
  if (!Err) {
    cout << "Floating-point evaluation: " << Approx << endl;
    cout << "Interval enclosure:        " << Encl << endl;
    cout << "Defect corrections needed: " << StaggPrec << endl;
  }
  else
    cout << EvalErrMsg(Err) << endl;
  return 0;
}

/* Output

Evaluation of (333.75-a^2)b^6+a^2(11a^2b^2-121b^4-2)+5.5b^8+a/(2b)

Enter the arguments:
 a = 77617
 b = 33096

Desired accuracy: Eps = 1e-15

Floating-point evaluation:  1.172603940053179E+000
Interval enclosure:        [-8.273960599468215E-001,
                            -8.273960599468213E-001]
Defect corrections needed: 2
*/



Again, the computed interval enclosure is sharp. 



5.3 Example: Linear System of Equations

We want to solve the (ill-conditioned) system of linear equations Ax = b with

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} 64919121 & -159018721 \\ 41869520.5 & -102558961 \end{pmatrix}, \quad b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$

The correct solution is $x_1 = 205117922$, $x_2 = 83739041$.

To solve this 2x2 system numerically we first use the well-known formulas

$$x_1 = \frac{a_{22}}{a_{11}a_{22} - a_{12}a_{21}}, \qquad x_2 = \frac{-a_{21}}{a_{11}a_{22} - a_{12}a_{21}}. \qquad (2)$$

The following ANSI-C program

#include <stdio.h>

int main(void)
{
  double a11= 64919121.0,  a12= -159018721.0,
         a21= 41869520.5,  a22= -102558961.0,
         h1, h2, x1, x2;

  h1= a11*a22;
  h2= a12*a21;
  x1= a22/(h1-h2);
  x2= -a21/(h1-h2);
  printf("x1= %15f x2= %15f\n", x1, x2);
  return 0;
}

produces the totally wrong result

$x_1 = 102558961$, $x_2 = 41869520.5$.

I.e., using IEEE double arithmetic to evaluate the formulas (2) shown above gives
meaningless numerical results.
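Why the double evaluation fails can be seen by computing the determinant exactly. The sketch below is an illustration in standard C++, not part of C-XSC; all function names are chosen here. Since a21 = 41869520.5 is a half-integer, twice the determinant is an integer that fits into 64 bits. The volatile stores in det_double only ensure that both products are rounded separately, as in the ANSI-C program, and are not fused by the compiler.

```cpp
// Matrix entries; a21 = 41869520.5 is stored as 2*a21 to stay in integers.
const long long A11 = 64919121;
const long long A12 = -159018721;
const long long TWICE_A21 = 83739041;
const long long A22 = -102558961;

// Twice the determinant, computed exactly: 2*(a11*a22 - a12*a21).
long long twice_det_exact() {
    return 2 * A11 * A22 - A12 * TWICE_A21;
}

// The determinant as computed in IEEE double arithmetic.
double det_double() {
    volatile double h1 = 64919121.0 * -102558961.0;   // exactly representable
    volatile double h2 = -159018721.0 * 41869520.5;   // must be rounded
    return h1 - h2;
}

struct Solution {
    long long x1;
    long long x2;
};

// Exact solution of Ax = b for b = (1, 0): x1 = a22/det, x2 = -a21/det.
// The integer divisions are exact here because twice the determinant is -1.
Solution solve_exact() {
    const long long d2 = twice_det_exact();
    Solution s = {2 * A22 / d2, -TWICE_A21 / d2};
    return s;
}
```

The exact determinant is -0.5, whereas the double computation yields -1: the product a12*a21 lies exactly halfway between two neighbouring doubles and is rounded to the even one. This factor of 2 is precisely the error in the wrong result above.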

We now try to solve the linear system using Matlab. 

Here we compute the inverse matrix (theoretically, the first column of the inverse
is the solution of the linear system, because b is the first unit vector)

>> inv(A)
Warning: Matrix is close to singular or badly scaled.
         Results may be inaccurate. RCOND = 1.651447e-17.

ans =
   106018308.007132   -164382474.017831
   43281793.0017831   -67108864

>> A*inv(A)
ans =
    0    2
   -1    2

>> inv(A)*A
ans =
    1    2
    0    1

A*inv(A) as well as inv(A)*A should give the identity matrix. Obviously, the
computed results are again not reliable. But this time we get at least a warning
from Matlab.

If we try to compute an enclosure of the solution vector x using Rump’s 
IntLab package [18] 

x = verifylss(A,b) 

we get the same warning as in Matlab (indeed it is the Matlab warning) and the 
output 

No inclusion achieved.
x =
   NaN
   NaN

IntLab is not able to solve the system. No meaningless numerical values are 
produced. 

Let us now try to solve our ill-conditioned problem using C-XSC. Calling 
the solver for systems of linear equations from the Toolbox library [2] (using the 
interactive toolbox example program lss_ex) we get the following enclosure of 
the solution: 

Enter the dimension of the system: 2 

Enter matrix A: 

64919121 -159018721 
41869520.5 -102558961 

Enter vector b: 

1 0 



Naive floating-point approximation:
 2.051179220000000E+008
 8.373904100000000E+007

Verified solution found in:
[ 2.051179220000000E+008, 2.051179220000000E+008]
[ 8.373904100000000E+007, 8.373904100000000E+007]

Condition estimate: 1.2E+017

The computed result is the correct solution (internally the toolbox routine makes 
use of the accurate dot product evaluation available in C-XSC). 



5.4 Example: Cauchy Principal Value Integral 

The freely available package CLAVIS (Classes for verified Integration over
Singularities) has been developed and implemented using C-XSC by Wedner
as part of his thesis [20]. This package allows the computation of enclosures for
definite integrals of several kinds (Riemann, Cauchy principal values, ...).

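Let us start with two definitions:

The Cauchy principal value integral $I(f;\lambda)$ is defined as follows:

$$ I(f;\lambda) := \mathrm{p.v.}\!\int_a^b \frac{f(x)}{x-\lambda}\,dx := \lim_{\varepsilon\to 0+}\left( \int_a^{\lambda-\varepsilon} \frac{f(x)}{x-\lambda}\,dx + \int_{\lambda+\varepsilon}^{b} \frac{f(x)}{x-\lambda}\,dx \right), \qquad \lambda\in(a,b), $$

and $f \in C^{2n+1}[a,b]$.

The nested integral $I(f;\lambda,\mu)$ is defined in the following way:

$$ I(f;\lambda,\mu) := \mathrm{p.v.}\!\int_a^b \mathrm{p.v.}\!\int_c^d \frac{f(x,y)}{(x-\lambda)(y-\mu)}\,dy\,dx, \qquad \lambda\in(a,b),\ \mu\in(c,d). $$

We now compute an enclosure of the nested integral

$$ I(f;\lambda,\mu) = \mathrm{p.v.}\!\int_1^2 \mathrm{p.v.}\!\int_1^2 \frac{\sin(e^{x^2})\,\sin(e^{y^2})\,e^{x^2+y^2}}{(x-\lambda)(y-\mu)}\,dy\,dx $$

with $\lambda = 1.25$ and $\mu = 1.5$ using the CLAVIS library. The header file
"cubature.h" belongs to the CLAVIS library, and the object file cubature.o
must be linked with the program. The following program also demonstrates how
exceptions may be handled.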



#include <iostream>
#include "cubature.h" //don't forget to link cubature.o
//source code of this program is assumed to be in the clavis directory

using namespace std;
using namespace cxsc;

// cauchy x cauchy integral (using cauchy x cauchy formula)
//
// f(x,y) = sin(exp(y*y)) * exp(y*y) * sin(exp(x*x)) * exp(x*x)
//
// complete integrand of I(f; lambda, mu): f(x,y) / ((x-lambda)*(y-mu))
//

int main() {
  try {
    operand r( exp(sqr(y)) ), s( exp(sqr(x)) );
    integrand f = sin(r) * r * sin(s) * s;

    double lambda=1.25;   //singularity in x direction
    double mu=1.5;        //singularity in y direction

    double xlb=1, xub=2;  //x-range of integration
    double ylb=1, yub=2;  //y-range of integration
    double eps= 1e-6;     //required accuracy

    cauchy_integral example(f, lambda, mu);

    //compute an enclosure of I(f; lambda, mu):
    example.integrate(xlb, xub, ylb, yub, eps);
    cout << SetPrecision(8,2) << Scientific
         << "Required max. diameter of remainder: " << eps << endl
         << SetPrecision(16,12) << example << endl;
  }//try
  catch (integrand::error e)
  { cout << " formelgen. " << e.i << endl; }
  return 0;
}//main



/* Output:

Required max. diameter of remainder: 1e-06
number of intervals : 109 (44)
#f : 17233

approximationsum    : [-7.6237054671070354E+001,-7.6237054670795458E+001]
d(approximationsum) : 2.7489477361086756E-010

remainder           : [-4.9415981455851922E-007,4.9416704156171493E-007]
d(remainder)        : 9.8832685612023414E-007

enclosure           : [-7.6237055165230175E+001,-7.6237054176628404E+001]
d(enclosure)        : 9.8860176933612820E-007
*/



The output shows that

$$ \mathrm{p.v.}\!\int_1^2 \mathrm{p.v.}\!\int_1^2 \frac{\sin(e^{x^2})\,\sin(e^{y^2})\,e^{x^2+y^2}}{(x-1.25)(y-1.5)}\,dy\,dx \in [-76.2370552,\ -76.2370541]. $$

This result is guaranteed by the algorithm itself.



5.5 Example: Time Measurements 

We are frequently asked for timings. Here we give a frame for time measurements. 
The source code can be modified in an obvious way to do timings for other 
operations and functions. 

//Simple frame for time measurements
#include <iostream>
#include <cstdlib>        //exit()
#include <ctime>          //clock()
#include "interval.hpp"   //interval operations
#include "imath.hpp"      //elementary functions for interval arguments

using namespace std;
using namespace cxsc;

void start_clock(clock_t& t1);     //function to start the timer
void print_time_used(clock_t t1);

int main()
{
  long iMax= 100000;

  cout << "Number of repetitions: " << iMax << endl;
  interval x(200.0,200.001);
  clock_t t;                       //defined in <ctime>

  cout << "Elementary function calls ..." << endl;
  start_clock(t);
  for(long i=0; i<iMax;)
  {
    x= ln(exp(atan(sin(cos(x)))));
    i++;                           //avoid compiler optimization
  }
  print_time_used(t);
  return 0;
}

void start_clock(clock_t& t1)
{
  t1= clock();
  if (t1 == clock_t(-1))           //terminate if timer does not work properly
  {
    cerr << "Sorry, no clock\n";
    exit(1);
  }
}

void print_time_used(clock_t t1)
{
  clock_t t2= clock();
  if (t2 == clock_t(-1))
  {
    cerr << "Sorry, clock overflow\n";
    exit(2);
  }
  cout << "Time used: " << 1000*double(t2-t1)/CLOCKS_PER_SEC
       << " msec" << endl;
}

/*
Results computed on a SUN Ultra 60 Workstation running Solaris 7
using GNU C++ Compiler Version 3.2 without any optimization:

Number of repetitions: 100000
Elementary function calls ...
Time used: 1370 msec
*/

Note that this frame for time measurements is not well suited to measuring
very short or very long execution times. For further timing results we refer
to [21].
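For short execution times, a monotonic clock with finer resolution may be preferable to clock(). The following sketch is not part of C-XSC. It uses std::chrono::steady_clock from the C++11 standard library, and the helper name time_ms is hypothetical.

```cpp
#include <chrono>

// Runs `work` the given number of times and returns the elapsed
// wall-clock time in milliseconds, measured with a monotonic clock.
template <typename Work>
double time_ms(Work work, long repetitions) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (long i = 0; i < repetitions; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed =
        clock::now() - start;
    return elapsed.count();
}
```

A call such as time_ms([&]{ x = ln(exp(atan(sin(cos(x))))); }, iMax) would time the same loop body as in the frame above. Note that steady_clock measures wall-clock time, whereas clock() measures processor time, so the two figures are not directly comparable.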

6 Current Work on C-XSC 

— Finish the final version of C-XSC 2.0 (the current version is Beta release 2
from December 2002).

— Modify the sources so that C-XSC runs with more C++ compilers (e.g. SUN
Forte, Compaq, and other compilers available for Windows systems; up to now
C-XSC 2.0 only runs with GNU C++ compilers from version gcc 2.95.2 to
version gcc 3.2).

— Adapt and complete the C-XSC test suite for more C++ compiler versions.
Most C++ compilers do not conform completely to the C++ standard, which
still causes problems when using the existing rudimentary test suite. For the
time being, the installation of the C-XSC library is checked in the following
way: the numerical toolbox is installed as well, and the toolbox programs are
run. If the computed results are equal to the prestored correct values, it is
assumed that the C-XSC installation was successful.

— Improve performance: due to the extensive use of C++ exception handling,
template classes, and function inlining, it is (up to now) not possible to
compile C-XSC with the GNU compiler using, e.g., the optimization level -O3.

— For historical reasons C-XSC is built on emulations of several basic floating
point operations. This makes the current C-XSC run time system portable
but slow compared to the speed of hardware operations. Nowadays most
processors conform to the IEEE 754 standard, so fast hardware operations
are available for all rounding modes. These operations will be used in
forthcoming C-XSC versions (at least for specific processors such as Intel and SUN).

— Prepare a thorough documentation of the routines available in C-XSC. This
is important because, due to significant modifications concerning C++, most
available documentation is no longer up to date.

— Simplify and redesign the runtime system (RTS). The RTS comprises rounding
control, reliable input/output routines, routines to compute accurate dot
products for the data types real, complex, interval, and complex interval, ...

— Develop and implement parallel versions of self-verifying solvers
based on C-XSC and MPI on cluster computers.
Abstract. As interval analysis-based reliable computations find wider
application, more software is becoming available. Simultaneously, the 
applications for which this software is designed are becoming more di- 
verse. Because of this, the software itself takes diverse forms, ranging 
from libraries for application development to fully interactive systems. 
The target applications range from fairly general to specialized. 

Here, we describe the design of four freely available software systems 
providing validated computations. Oishi provides Slab, a complete, high- 
performance system for validated linear algebra whose user interface 
mimics both Matlab’s M-files and a large subset of Matlab’s command- 
line functions. In contrast, CADNA (Fabien Rico) is a C++ library 
designed to give developers of embedded systems access to validated 
numeric computations. Addressing global constrained optimization and 
validated solution of nonlinear algebraic systems, Kearfott’s GlobSol fo- 
cuses on providing the most practical such system possible without speci- 
fying non-general problem structure; Kearfott’s system has a Fortran-90 
interface. Finally, Neher provides a mathematically sound stand-alone 
package ACETAF with an intuitive graphical user interface for comput- 
ing complex Taylor coefficients and their bounds, radii of convergence, 
etc. 

Overviews of each package’s capabilities, use, and instructions for 
obtaining and installing appear. 

Keywords: Validated computations, numerical linear algebra, embed- 
ded systems, Taylor series, interval arithmetic, stochastic arithmetic, 
global optimization, interactive software systems, software libraries 
1 Introduction

This work describes four diverse but well-developed validated computing
packages: Slab (Shin'ichi Oishi), CADNA (Fabien Rico), GlobSol (R. Baker
Kearfott), and ACETAF (Markus Neher). Slab, based on Matlab syntax, pro- 
vides, in validated and interval form, many of the matrix operations and func- 
tions available in Matlab; Slab implements a novel, well-thought out scheme of 
directed rounding to efficiently achieve this result. CADNA implements both 
interval arithmetic and a type of stochastic arithmetic. GlobSol, containing a 
traditional but portable implementation of interval arithmetic, is meant for val- 
idated solution of general unconstrained and constrained global optimization 
problems. Finally, ACETAF focuses on computation of error bounds for Taylor 
coefficients. Slab provides a user interface that is identical, with some purposeful 
exceptions, to the familiar Matlab syntax, while ACETAF provides a convenient 
graphical user interface. CADNA consists of a C++ library for programmers, 
while GlobSol, although containing Fortran 90 libraries that are separately us- 
able, can be used as a stand-alone system in which the users input problems 
with standard Fortran syntax. 

Details for Slab appear in §2 below, while details for CADNA appear in §3, 
details for GlobSol appear in §4, and details for ACETAF appear in §5. We give 
a short overall summary in §6. 



2 Slab (Shin’ichi Oishi) 



2.1 Introduction and Overview 



S. Oishi and S. M. Rump have developed a new verification method called
rounding mode controlled verification, and have applied this method to systems
of simultaneous linear equations [41]. It has been shown in [41] that the total
cost of calculating an approximate solution of an n-dimensional system of
simultaneous linear equations together with a rigorous error bound is
(4/3)n^3 flops. Let us consider a computing system which conforms to the
IEEE 754 floating point standard. Let A and B be n × n matrices whose elements
are IEEE 754 double precision numbers. Then, as shown in [41], an inclusion of
the product of A and B can be calculated by



setround(down);
L = A * B;
setround(up);
U = A * B;



Here, MATLAB-like notation is used, and the instructions setround(down) and 
setround(up) mean to change the IEEE 754 rounding mode to -Inf and +Inf, 
respectively. Since we can use the optimized BLAS functions in this calculation 
to calculate a matrix product A*B, in practice this inclusion procedure can be 
executed with just twice as much time as that for calculating A*B using the 
optimized BLAS. This is the fact we used in [41] to develop our fast algorithm
to include a solution of a system of n-dimensional simultaneous linear equations. 
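The two-pass scheme above can be sketched in standard C++ with the <cfenv> rounding-mode functions. This is an illustration under stated assumptions, not the Slab implementation: the names Matrix, Enclosure, multiply and enclose_product are hypothetical, a plain triple loop stands in for the optimized BLAS call, and the platform is assumed to support FE_DOWNWARD and FE_UPWARD. Compilers may need an option such as GCC's -frounding-math to fully respect run-time rounding-mode changes; the volatile accumulator below is only a precaution.

```cpp
#include <cfenv>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Plain matrix product, evaluated in whatever rounding mode is active.
static Matrix multiply(const Matrix& A, const Matrix& B) {
    const std::size_t n = A.size(), k = B.size(), m = B[0].size();
    Matrix C(n, std::vector<double>(m, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < m; ++j) {
            volatile double s = 0.0;  // discourage compile-time evaluation
            for (std::size_t l = 0; l < k; ++l)
                s = s + A[i][l] * B[l][j];
            C[i][j] = s;
        }
    return C;
}

struct Enclosure {
    Matrix lower;  // elementwise lower bound of A*B
    Matrix upper;  // elementwise upper bound of A*B
};

// L = A*B rounded toward -Inf, U = A*B rounded toward +Inf.
Enclosure enclose_product(const Matrix& A, const Matrix& B) {
    const int saved = std::fegetround();
    Enclosure e;
    std::fesetround(FE_DOWNWARD);
    e.lower = multiply(A, B);
    std::fesetround(FE_UPWARD);
    e.upper = multiply(A, B);
    std::fesetround(saved);  // restore the caller's rounding mode
    return e;
}
```

Because every floating point operation in the first pass is rounded down and every operation in the second pass is rounded up, the exact product lies elementwise between lower and upper, at the cost of two ordinary matrix products.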

Then, in [40], we have shown that verified enclosure of all eigenvalues of ma- 
trices can be computed with less additional time than that required to initially 
compute all approximate eigenvalues and eigenvectors. The method proposed in 
[41] is also based on the rounding mode controlled verification method. More- 
over, it has been shown in the book [39] that the rounding mode controlled 
verification method has wide applicability to a variety of problems of numerical 
linear algebra. 

We developed Slab, a MATLAB-like numerical tool, as a test for these algo- 
rithms. We considered the suitability of several development environments for 
the design of Slab. In particular, to solve functional equations, one should have 
a tool having the following properties: 

— Support for operator overloading (for programming clarity and convenience) 
to handle various objects, such as intervals and automatic-differentiation, 
needed for verification. 

— Access to instructions for changing the rounding mode. 

— Availability of optimized BLAS routines for solving large problems. 

We have examined various numerical tools with regard to these criteria. MAT- 
LAB 6.x satisfies all the requirements listed above. In fact, Rump has imple- 
mented the MATLAB toolbox INTLAB 

(http://www.ti3.tu-harburg.de/~rump/intlab/), which has interval arith- 
metic, validated elementary functions, and rounding mode controlled computa- 
tion. One minor defect when using MATLAB is that part of the source code 
is not open. However, it is known that MATLAB uses LAPACK with the op- 
timized BLAS generated by ATLAS (an open-source project for Automatically 
Tuned Linear Algebra Software, see http://math-atlas.sourceforge.net/).

Scilab (http://www-rocq.inria.fr/scilab/) is another choice. Scilab
versions 2.6 and earlier mainly use LINPACK, so the level three BLAS routines
cannot be accelerated, even if one uses an optimized BLAS. From version 2.7
on, however, Scilab uses LAPACK, so one can use the optimized BLAS generated
by ATLAS. Moreover, Scilab provides operator overloading through the t-list.
An instruction for changing the rounding mode can be implemented in Scilab
using its "link" and "call" functions for C object files. Thus, Scilab 2.7
satisfies all requirements mentioned above.

Octave (http://www.octave.org/) is also a candidate. Although it uses
LAPACK almost optimally, Octave does not have operator overloading. An
instruction for changing the rounding mode can be implemented through an oct-file.

RLAB (http://rlab.sourceforge.net/) is also a good choice. It uses LAPACK.
Its grammar is similar to that of the C language. It seems that RLAB does
not yet support user-defined instructions. However, one can easily introduce
a rounding-mode-changing instruction by directly modifying its source code.

Based on these observations, we think that it is useful to introduce a new
small language designed for verification. For these reasons, we have developed
Slab, a new MATLAB-like interpreter. Slab's grammar is a mixture of MATLAB
and RLAB. It uses LAPACK, so it can be accelerated by an optimized BLAS.
Slab has a unique feature, a verification mode: based on a recent result of
the author, it provides verified results for solutions of simultaneous linear
equations, eigenvalue problems of matrices and many standard problems in
numerical linear algebra. Slab's instructions and operator overloading are
implemented directly in its source code. Slab is free software under the GNU
license, and is downloadable from the site

http://www.oishi.info.waseda.ac.jp/~oishi/index.html

Slab can be installed on a Redhat 7.2 based Linux PC with a Pentium CPU. 
Moreover, with a little modification, it can be installed on Windows using Cyg- 
win or on a Macintosh with OS X. 



2.2 Overview of Slab 

In this section, we give an overview of Slab. 

1. Slab is a MATLAB-like numerical tool designed for verified numerical com- 
putation. 

2. Slab has many new features suitable for verified numerical computation. For 
example: 

a) Rounding instructions: up(), down(), and near() are defined for rounding
toward +infinity, toward -infinity and to nearest, respectively.

b) There is a validation mode. To enter the validation mode, type !
(see the description of Slab's operation modes below).

c) In validation mode, the solution of Ax = b can be obtained with an error
bound by typing x=A\b.

3. Slab has the built-in functions: 

sin, cos, log10, ln (log), exp, abs (fabs),
tan, atan, acos, asin, sinh, tanh, sqrt

In approximation mode, they coincide with C's built-in functions. In
verification mode, although they return values calculated by multiple precision
routines, their return values are still not verified.

4. The user can define functions by 

function func_name(a,b, . . . ,z) {expr;expr; . . . ;expr}, 

where the expr represent expressions. 

5. The instructions for, while and if can be used. 

6. Matrices can be treated. 

7. The imaginary unit should be introduced with i=sqrt(-1).

8. The interval [a, b] can be entered with interval(a,b).



We now explain several Slab instructions in more detail: 
function. The instruction "function" defines a user-defined function. As an
example, we present a program here for inclusion of a solution to

Ax = b.

Here, A is an n x n real matrix and b is a real n-vector.

Algorithm 1: Inclusion algorithm for matrix equations

function f(A,b,n) {
  # approximate inverse and approximate solution
  R=inv(A);
  x=R*b;
  # lower bounds of R*A-I and of the residual A*x-b
  down();
  U=R*A-eye(n);
  s=A*x-b;
  # upper bounds of the same quantities
  up();
  V=R*A-eye(n);
  t=A*x-b;
  up();
  # interval enclosures and the resulting error bound e
  r=int(s,t);
  T=int(U,V);
  d=abs(T);
  Ar=R*r;
  ar=abs(Ar);
  dd=norm(d);
  arr=norm(ar);
  e=arr/(1-dd);
}
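The quantity e computed at the end of Algorithm 1 can be read as a standard a posteriori error bound. The following is a sketch of the reasoning, assuming that $\tilde{x}$ denotes the approximate solution x=R*b and that $\|\cdot\|$ is a vector norm together with its induced matrix norm (the text does not state which norm Slab's norm function uses).

```latex
\[
  \|RA - I\| < 1
  \;\Longrightarrow\;
  A^{-1} = (RA)^{-1} R,
  \qquad
  \|(RA)^{-1}\| \le \frac{1}{1 - \|RA - I\|},
\]
\[
  \|A^{-1}b - \tilde{x}\|
  = \|(RA)^{-1} R\,(b - A\tilde{x})\|
  \le \frac{\|R\,(A\tilde{x} - b)\|}{1 - \|RA - I\|}.
\]
```

In the algorithm, dd is an upper bound for $\|RA - I\|$ obtained from the interval matrix T, and arr is an upper bound for $\|R(A\tilde{x} - b)\|$ obtained from the interval residual r. The bound is meaningful only when dd < 1, and a fully rigorous implementation also has to bound the denominator 1-dd from below.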

eig. The function eig(A) returns all the eigenvalues and eigenvectors of an n by 
n point matrix A: 

A> A=rand(3);
A> sol=eig(A)
ans.val =
  |***|
  |***|

ans.vec =
  |***|
  |***|
  |***|

In this example, sol.val gives a diagonal matrix whose diagonal elements
consist of all eigenvalues of A. On the other hand, the n-th column of sol.vec
is an eigenvector of A corresponding to the n-th diagonal element of sol.val.
This function uses CLAPACK functions with optimized BLAS functions:

— For a real symmetric A, dsyev_ is used. 
— For a real general A, dgeev_ is used.

— For a Hermitian A, zheev_ is used.

— For a general complex A, zgeev_ is used.

interval. The interval instruction is used to make an interval. The objects a
and b can be doubles or matrices.

A> a=interval(3,5)
ans =
  [3,5]

A> A = rand(2);
A> Z = [A,A+0.1]
ans =
  | [0.3 , 0.4 ]  [ -0.1 , 0 ]  |
  | [0.2 , 0.3 ]  [ 0.1 , 0.2 ] |

Addition, subtraction, multiplication and division are overloaded.

read. In Slab, files having a name like "filename.s" are called s-files. If Slab
commands are written in an s-file, then the s-file can be read as

A> read filename.s

The following is an example: 

shell> cat test.s 
a = 3 
b = 5 
c = a + b 
d = a * b; 
shell> Slab 
Welcome to Slab! 

A> read test.s 
ans = 

3 

ans = 

5 

ans = 

8 

A> d 
ans = 

15 

solve. The function solve(func,x) is a one-dimensional nonlinear equation
solver based on Newton's method. Here, func is the name of a function obtained
by func=name(f), where the user-defined function f(x) is defined separately.
One then types solve(func,x) to solve the nonlinear equation

f(x) = 0.

Here, x is an initial guess of a solution. The following is an example:

A> # Since function 'solve' is defined in s-file,
A> # one should first read s-file 'fsolve.s' by
A> read fsolve.s
A> # Then, define a nonlinear function.
A> function f(x) {
A> a_= sin(sin(x))-0.5;
A> }
A> # Then, solve f(x)=0.
A> x=[1];
A> a=name(f);
A> y=solve(a,x)
ans =
  0.55106958309945
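For comparison, the same equation can be solved with a few lines of plain floating point Newton iteration. The sketch below is not Slab's solve; the function name newton, its tolerance and its iteration limit are illustrative assumptions, and the derivative is supplied analytically rather than approximated.

```cpp
#include <cmath>
#include <functional>

// Newton iteration x <- x - f(x)/f'(x), stopped when the step is small
// relative to x or after max_iter iterations.
double newton(const std::function<double(double)>& f,
              const std::function<double(double)>& df,
              double x, double tol = 1e-14, int max_iter = 50) {
    for (int k = 0; k < max_iter; ++k) {
        const double step = f(x) / df(x);
        x -= step;
        if (std::fabs(step) <= tol * std::fmax(1.0, std::fabs(x))) {
            break;
        }
    }
    return x;
}

// The example function from the Slab session and its derivative.
double f(double x)  { return std::sin(std::sin(x)) - 0.5; }
double df(double x) { return std::cos(std::sin(x)) * std::cos(x); }
```

Starting from the initial guess 1, newton(f, df, 1.0) converges to the root asin(asin(0.5)), which agrees with the value printed by Slab. Unlike a verified solver, this iteration gives no guarantee on the error of the returned value.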

linpro. The instruction linpro(c,C,b) is an interface to the GNU routine
lp_solve and solves the following linear programming problem:

max: c'x;
subject to
Cx <= b;
x >= 0;

Here, c is an n-dimensional objective vector, C is an m x n matrix, and b is an
m-dimensional right hand side vector. Here is an example:

A> c = [-1,2];
A> C = [2,1;-4,4];
A> b = [5,5];
A> linpro(c,C,b)

Value of objective function: 3.75
x0 1.25
x1 2.5

The function linpro is an interface to the GNU routine lp_solve.

In addition to the instructions listed above, the functions 

chol, do-while, eval, fft, for, getbits, if, ifft, 
inv, linspace, lread, lu, max, name, ode, plot, 
print, qr, quad, save, schur, svd, while, who 

and others are implemented in Slab. 

Finally, we shall describe a bit about Slab’s verification mode. Slab has three 
operation modes: help mode, approximation mode and verification mode. Slab’s 
prompts H>, A> and V> are assigned for help mode, approximation mode and 
verification mode, respectively. In each mode, if we type help, $ or !, then
Slab’s mode changes to help mode, approximation mode and verification mode, 
respectively. 
The instructions A\b and eig(A) behave differently according to the mode.
Here, A is an n x n matrix and b is an n-vector. In approximation mode, the 
instructions A\b and eig(A) have the same meaning as those in MATLAB. In 
verification mode, they also return error bounds, if possible. 



3 CADNA (Fabien Rico) 

Fixed CADNA¹ is a C++ library designed to give developers tools for
estimating the quality of numerical results of embedded codes. Embedded
architectures are low cost solutions that can be found everywhere, such as in cars,
planes or cellular phones. These architectures are generally based on a simple 
processor that performs computations with fixed point numbers. They have to 
manage increasingly complex programs. Because many embedded systems are 
critical, there is a growing need for numerical validation of the results produced 
by such systems. 

The Fixed CADNA library has been developed to help the designer of em- 
bedded code to define the fixed point arithmetic that fits the problem to be 
solved. More precisely, Fixed CADNA helps the designer seek the optimum dy- 
namical range of the variables of the program (from the memory size point of 
view), to seek the numerical instabilities of his algorithm and to validate the 
result produced. This library is composed of a set of classes which can be sub- 
stituted for the float and double type, and a graphical user interface. Thus, the 
embedded program designer can run code at the same time on several fixed arith- 
metics. The core of Fixed CADNA is stochastic arithmetic and the CESTAC 
method, which has been successfully used in the CADNA library for validating 
floating point code [4]. 

Section 3.1 presents our models of fixed point, interval, and stochastic arith- 
metic. Section 3.2 describes the library and the facilities offered to the pro- 
grammer. Our library produces a log file that shows the numerical instabilities 
produced by the embedded code. This log file format is shown in section 3.3. 
Finally, the graphical user interface is described in section 3.4. 



3.1 Arithmetic Models 

We characterize fixed point representations with three values that define the 
dynamical range and the precision: 

— The precision s ∈ {0, 1, ..., 31} represents the number of bits of the number.

— The position p ∈ {0, 1, ..., s} indicates the number of digits after the point.

— The scale e ∈ ℤ is for scaled fixed point representations.



¹ Acknowledgment: This section describes joint work of the ANP team
of the LIP6 laboratory at Université Pierre et Marie Curie. Special thanks go to
Jean-Marie Chesneaux and Laurent-Stéphane Didier, with whom this project has been
developed.
Next, each number in the fixed point representation is defined by two values:

— the sign ε ∈ {−1, 1},

— the integer mantissa m ∈ {0, 1, ..., 2^s}.

Thus, the value of a number X is given by the following formula:

$$ X = \varepsilon \times m \times 2^{e-p}. \qquad (1) $$

This formula is similar to the formula that gives the value of floating point
numbers, but in equation (1), the exponent e − p is fixed.
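Formula (1) can be made concrete with a small sketch. The type FixedPoint and the functions value and quantize below are hypothetical illustrations and are not the classes of the Fixed CADNA library; in particular, rounding to nearest and saturation at the largest mantissa 2^s are assumptions made for this example.

```cpp
#include <cmath>
#include <cstdint>

// A fixed point number in the model above (illustrative type only).
struct FixedPoint {
    int sign;                // epsilon, either -1 or 1
    std::uint32_t mantissa;  // integer mantissa m
    int p;                   // number of digits after the point
    int e;                   // scale
};

// Value of the number according to formula (1): sign * m * 2^(e - p).
double value(const FixedPoint& x) {
    return x.sign * std::ldexp(static_cast<double>(x.mantissa), x.e - x.p);
}

// Nearest representable number for precision s, position p and scale e.
// Magnitudes that are too large saturate at the mantissa 2^s (assumption).
FixedPoint quantize(double v, int s, int p, int e) {
    const double scaled = std::ldexp(std::fabs(v), p - e);  // |v| / 2^(e-p)
    const double max_m = std::ldexp(1.0, s);                // 2^s
    const double m = std::fmin(std::round(scaled), max_m);
    return FixedPoint{v < 0 ? -1 : 1, static_cast<std::uint32_t>(m), p, e};
}
```

For example, with s = 8, p = 2 and e = 0 the spacing between representable numbers is 2^(-2) = 0.25, so quantize(1.3, 8, 2, 0) has mantissa 5 and value 1.25.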

Building on the fixed point representation, the Fixed CADNA library gives 
two additional representations: 

— the interval fixed point representation, consisting of an interval composed of 
two fixed point numbers. It is adapted from classical interval arithmetic [27]. 

— the stochastic fixed point representation, using the CESTAC method for 
estimating the accuracy of a number. 

The aim of the CESTAC [48,49] method, based on the probabilistic approach
to round-off errors, is to estimate the effect of propagation of round-off errors on
every computed result obtained with a finite arithmetic. It consists of making
the round-off errors propagate in different ways to distinguish between a stable
part of the mantissa², considered the significant part, and an unstable part³,
considered non-significant.

² the part of the result that does not change with different propagations

³ the part of the result that depends on the round-off error

The first basic idea of the CESTAC method is to replace the usual finite
arithmetic by a random arithmetic. The random arithmetic is obtained from
the usual finite arithmetic by randomly perturbing the lowest-weight bit of the
mantissa of the result of each arithmetic operation. The second basic idea is to
run a code several times with this new arithmetic to obtain different results for
each run.

In practice, the use of the CESTAC method consists of:

1. running the same program N times in parallel with the random arithmetic;
consequently, for each intermediate result R of any finite arithmetic operation,
a set of N different computed results $R_i$, i = 1, ..., N, is obtained,

2. taking the mean value $\bar{R} = \frac{1}{N}\sum_{i=1}^{N} R_i$ as the computed result,

3. using Student's distribution to estimate a confidence interval for $\bar{R}$, and
then computing the number $C_{\bar{R}}$ of significant bits of $\bar{R}$ (i.e. the common bits
between $\bar{R}$ and the exact result r) defined by

$$ C_{\bar{R}} = \log_2\left(\frac{\sqrt{N}\,|\bar{R}|}{s\,\tau_\beta}\right) \qquad \text{with} \qquad s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(R_i - \bar{R}\right)^2 . $$

$\tau_\beta$ is the value of Student's variable t for N − 1 degrees of freedom and
probability β. A major advantage of this method is that small values of N
suffice: in practice, N = 2 or 3 is enough to obtain a sufficiently accurate
confidence interval.
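The estimate in step 3 is easy to evaluate once the N sample results are available. The sketch below is not CADNA code: the function name significant_bits is hypothetical, the random perturbation of the arithmetic itself is not modeled, and the caller supplies the Student quantile (for N = 3 and a 95% confidence level, the two-sided quantile with 2 degrees of freedom is approximately 4.303).

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Estimated number of significant bits of the mean of the samples:
// C = log2( sqrt(N) * |mean| / (s * tau) ), where s^2 is the sample
// variance with N - 1 in the denominator. Requires at least two samples.
double significant_bits(const std::vector<double>& samples, double tau) {
    const double n = static_cast<double>(samples.size());
    double mean = 0.0;
    for (double r : samples) mean += r;
    mean /= n;

    double sum_sq = 0.0;
    for (double r : samples) sum_sq += (r - mean) * (r - mean);
    const double s = std::sqrt(sum_sq / (n - 1.0));

    if (s == 0.0) {
        // All samples agree: no instability is visible in this sample set.
        return std::numeric_limits<double>::infinity();
    }
    if (mean == 0.0) {
        return 0.0;  // a zero mean with nonzero spread has no significant bits
    }
    return std::log2(std::sqrt(n) * std::fabs(mean) / (s * tau));
}
```

For three samples 1, 1 + 2^(-20) and 1 − 2^(-20), the mean is 1 and s = 2^(-20), so the estimate is 20 + log2(√3/4.303), roughly 18.7 bits. An infinite return value only means that the samples were identical; an actual implementation would cap the result at the number of mantissa bits.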

The validity of this method has been proved under hypotheses which gener- 
ally hold in real-life problems [3]. The hypotheses can be controlled during the 
run. 

The primary application of the CESTAC method is to compute the num- 
ber of exact significant bits of computed results, but the capability of knowing 
the accuracy of results leads to a new arithmetic: stochastic arithmetic [5,6, 
49]. Stochastic arithmetic may also be seen as a model of a finite arithmetic 
with accuracy control. In stochastic arithmetic, order relations and the notion 
of equality are redefined to take into account the accuracy of operands. 

For instance, two values will be stochastically equal if their difference is only 
due to round-off error propagation. For the order relation, a value will be strictly 
greater than another value if it is significantly greater than the other. On the 
other hand, a value will be greater or equal to another value if it is greater than 
the other or if their difference is only due to round-off error propagation. 

Discrete Stochastic Arithmetic (DSA) is the joint use on a computer of the 
synchronous implementation of the CESTAC method and the stochastic defini- 
tions of order and equality relations. DSA enables one to estimate the impact of 
round-off errors on any result of a scientific code and also to check that no nu- 
merical instability occurred during the run, especially in branching statements. 
Moreover, the ability to estimate the numerical quality of any intermediate re- 
sult leads to a true dynamical numerical debugging by detecting all numerical 
instabilities while running the code. 

3.2 Using the Library 

The goal of this library is to allow the developer to execute existing C code 
with new types without completely rewriting it. A simple mechanism for easily 
substituting types is to consider these types as objects having the same interface. 
Thus, our C++ library is composed of a set of classes defining new types that can 
be substituted for float and double C types. 

In practice, a generic type REAL is used for every variable whose type is 
changed. Next, it is necessary to include the header file corresponding to the 
chosen representation and compile the code. This inclusion associates the chosen 
fixed number representation to the generic type REAL. All computations on the 
REAL variables are performed with the selected representation. 

Because all the types defined in our library are parameterized by the size 
s, the number of digits in the fractional part p, and the scale e, each variable 
with the generic type REAL must be declared with at least these parameters. This 
declaration constructs an object that has the properties of the chosen representa- 
tion. Moreover, the REAL constructor may take a value v as an extra parameter, 
that is a double number with which the variable is initialized. 

The arithmetic operations are allowed only between numbers in the same
representation. This means that operations between two numbers expressed in
a different fixed representation are not permitted. An explicit conversion has to
be made by the developer by reallocating the value with the operator =. Thus,
precision losses due to implicit conversion are avoided. Moreover, to manipulate
constant values, computations with double are allowed.

Table 1. Summary of the C++ library

Type             Fixed       Interval Fixed    Stochastic Fixed
Header file      fixed.h     interval.h        cadna_fixed.h
Defined type     REAL
Initialization   REAL::init(int nb, char *log_file)
                 REAL::init(int nb, int div, int test, int mul, int lost,
                            int threshold, char *log_file)
Constructor      REAL(int s, int p, int e)
                 REAL(int s, int p, int e, double v)
Operators        ..., +, ++, ==, !=, <, >, <=, >=, <<

At execution time, a log file containing all the numerical instabilities (see
section 3.3) detected by our library is produced. The programmer can specify
the name of this log file and the maximum number of instabilities that will be
detected and noted. Furthermore, it is possible to select the kind of instabilities
to be logged for the stochastic fixed and interval fixed representations. Thus
the library is initialized by the function REAL::init. A full description of this
function is developed in section 3.3.

Table 1 summarizes the different functionalities of the new representations
available in Fixed CADNA.

The example in Table 2 illustrates the use of our library on a FIR filter C code.
In this code, the FIR filter computations are performed with a fixed point
representation that is parameterized by the user. It has size bits and has nb bits in
the fractional part. Including the file fixed.h permits the generic type REAL to
represent fixed point numbers. The log file is initialized with REAL::init(-1). It will
contain all the instabilities detected and will be named instability_fixed.log.
Note that the declaration of single variables and arrays is different. For instance,
a simple variable output is declared with REAL output(size,nb,0), and the
array h is declared with REAL *h=new REAL[NTAPS](size,nb,0).

3.3 The Log File 

Using a finite arithmetic may lead to several instabilities. Our library has been 
designed to detect the following problems: 

— Overflow, 

— Division by a stochastic zero (only for stochastic representation), 

— Division by an interval containing zero (only for interval representation), 
Table 2. FIR filter example

#include <iostream>
#include <cstdlib>   // atoi
#include "fixed.h"

// Reset the ntaps entries of z to zero.
void clear(int ntaps, REAL z[])
{
    int ii;
    for (ii = 0; ii < ntaps; ii++) {
        z[ii] = 0;
    }
}

// Basic FIR filter: store the new sample, accumulate the products of the
// coefficients h with the delay line z, then shift the delay line.
REAL fir_basic(REAL input, int ntaps,
               const REAL h[], REAL z[])
{
    int ii;
    REAL accum(input);

    z[0] = input;
    accum = 0;
    for (ii = 0; ii < ntaps; ii++) {
        accum += h[ii] * z[ii];
    }
    for (ii = ntaps - 2; ii >= 0; ii--) {
        z[ii + 1] = z[ii];
    }
    return accum;
}

int main(int argc, char *argv[])
{
#define NTAPS 6
#define IMP_SIZE (3 * NTAPS)

    static const double h_base[NTAPS] =
        { 1.1, 2.2, 3.3, 4.4, 5.5, 6.6 };
    int size, nb;

    REAL::init(-1);
    size = atoi(argv[1]);
    nb = atoi(argv[2]);
    static REAL *h = new REAL[NTAPS](size,nb,0);
    static REAL *h2 = new REAL[2 * NTAPS](size,nb,0);
    static REAL *imp = new REAL[IMP_SIZE](size,nb,0);
    REAL output(size,nb,0);
    int ii, state, i;

    // Build the input signal.
    clear(IMP_SIZE, imp);
    imp[5] = 1.0;
    imp[6] = 1.5001;
    imp[7] = 1.2;
    for (ii = 0; ii < NTAPS; ii++) {
        h2[ii] = h2[ii + NTAPS] = h[ii];
    }

    // Run the filter on the input signal.
    clear(NTAPS, z);
    for (ii = 0; ii < IMP_SIZE; ii++) {
        output = fir_basic(imp[ii],
                           NTAPS, h, z);
        std::cout << output << " ";
    }
}


— Insignificant comparison of two stochastic numbers that are too close relative
to their accuracy (only for stochastic representation),

— Comparison between overlapping intervals (only for interval representation),

— Multiplication of two stochastic zeros.
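
The two interval-specific instabilities in this list reduce to simple endpoint tests. The sketch below illustrates them on a hypothetical closed interval type; it is not the interval representation of Fixed CADNA, and the function names are chosen for illustration only.

```cpp
#include <cassert>

// Hypothetical closed interval [lo, hi]; not the Fixed CADNA interval type.
struct Interval {
    double lo, hi;
};

// Instability: the divisor contains zero, so the quotient is unbounded.
bool division_by_zero_interval(const Interval& divisor) {
    assert(divisor.lo <= divisor.hi);
    return divisor.lo <= 0.0 && 0.0 <= divisor.hi;
}

// Instability: the two intervals share at least one point, so a comparison
// such as a < b has no answer that is valid for all members of a and b.
bool overlapping_comparison(const Interval& a, const Interval& b) {
    assert(a.lo <= a.hi && b.lo <= b.hi);
    return a.lo <= b.hi && b.lo <= a.hi;
}
```

A library that follows the policy described here would record such an event in the log file and continue the computation rather than stop.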

Our choice is to never stop the computation and just log the instabilities into 
a file created by the init function. The CESTAC method detects cancellation 
that may occur in an addition or a subtraction. This information is logged in 
the same file. (See section 3.2.) The init function lets the user specify the log
file name and choose which instabilities will be logged:
static void fixed_st::init(int my_trace_cadna = 10000,
                           int div = TRUE,
                           int test = TRUE,
                           int mul = TRUE,
                           int lost = TRUE,
                           int threshold = 4,
                           char *log_file = "instability_fixed_st.log");
[Figure: execution windows of the same program, run with the inputs 32 20 and
n = 100 under the three representations. Fixed (executable temp_fixed):
S(100) = 75.0483. Stochastic Fixed (temp_fixed_st): S(100) = 0.75049E+02(15).
Interval Fixed (temp_fixed_int): S(100) = [-2048:2048].]


For each type of arithmetic used, it is possible to select what is logged: 

— stochastic fixed: the arguments div, test, mul, and lost are Boolean flags
activating the log for division by stochastic zero, inconsistent comparison,
multiplication of stochastic zero, and cancellation, respectively. The integer
threshold is the threshold used for detecting cancellation, and
my_trace_cadna is the maximum number of messages displayed. (If
my_trace_cadna = -1, then all messages are displayed.)

— interval fixed: mul, lost, and threshold are ignored. They are only present 
to obtain the same interface. 

— initial fixed: only the first and the second arguments are used. 
3.4 The GUI for Parallel Execution

When using the library, it is necessary to choose a particular arithmetic (stochastic
or interval arithmetic). However, developers may need to test a program with
all the arithmetics to take advantage of each of them. For this purpose, it is
useful to run the same program with the different arithmetics in parallel. The
Graphical User Interface is designed to make this easy.

The main window (see Figure 1) is split into three execution windows (1.3, 
1.4, 1.5). Each of these presents the program result with a different representa- 
tion. Below them, there are the log windows (1.6, 1.7, 1.8) presenting the content 
of the corresponding log file. (See section 3.3.) Those execution and log windows 
may be deleted using the Property entry of the menu 1.1. 

A program is chosen through the menu File>Open or the button Open 1.2.
The Compile and Run commands enable execution of the program with the
different representations. If the running program needs keyboard input, the
entry 1.9 enables one to dispatch it on a different process. Finally, the principal
log window 1.10 keeps a trace of the previous commands.

3.5 Summary 

In this section, we have introduced a new method to perform validated numerical
calculations for embedded applications.

Numerical validation tools have existed before, but none of them is specifically
designed for embedded applications, because they lack support for fixed
point representation. Our work tries to fill this gap. It is based on a new
library that applies various known validation methods to fixed point
numbers.

This library is just the first piece of work towards a complete toolbox dedi- 
cated to numerical validation of embedded applications. 

4 GlobSol (R. Baker Kearfott) 

4.1 Introduction 

GlobSol began as a research code to study algorithms for verified Global Op- 
timization. GlobSol grew out of INTBIS [23], a relatively simple FORTRAN-77 
code and ACM Transactions on Mathematical Software algorithm for finding 
all solutions, with validation, to nonlinear algebraic systems. For ease of ex- 
perimentation, simple automatic differentiation, consistent with the relatively 
small problems originally envisioned, was added, and a special technique for 
bound constraints (originally tried in [16]) was implemented. We also provided 
extensive capability for a technique we described in [15], a technique (discov- 
ered independently and probably earlier by others) that has developed into the 
field of “constraint propagation.” One of the first projects done within this en- 
vironment was development of techniques for avoiding the “cluster” problem 
([7], [22], [46]) that occurs in exhaustive search algorithms when the system 
is ill-conditioned or singular near the global optimum. We also implemented
a technique for verifying feasible points [19] and thus included a capability for
handling general equality-constrained problems. (We added separate handling of
inequality-constrained problems later.) We studied and implemented extensions
to the idea of interval slopes and slope arithmetic (perhaps first appearing in [25])
to non-smooth functions, as we explained in [17] and [18, Ch. 6].

During this development (roughly from 1993 to 1998), we referred to GlobSol
as INTOPT_90. A collected review of these and other techniques, some new
theoretical analyses, and a description of the structure of INTOPT_90 appear in
[18].

GlobSol took on its present form (and its present name) as part of a co- 
operative research and development contract funded by Sun Microsystems and 
directed by G. W. Walster (and with extensive participation of George Corliss). 
The most significant advances during this phase of GlobSol’s development are 
perhaps 

— extensive testing and bug-removal (extremely important for software that
purports to validate),

— polishing of the user interface, 

— experimentation with GlobSol on a variety of practical problems, and 

— polishing of the packaging, distribution, and installation process. 

Although at first glance these advances may seem mundane, they are both a 
significant part of the total effort and absolutely indispensable for widely-used, 
lasting software. 

We have recently provided some details of the above in the succinct review 
[20]. Here, we very briefly review requirements for installation and use of GlobSol, 
then focus on present weaknesses in GlobSol and how we are eliminating these 
weaknesses. 



4.2 Statement of the Problem GlobSol Treats 



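For reference below, we now formally state the type of problem GlobSol solves.
The general optimization problem is

\[
\begin{aligned}
&\text{minimize}   && \phi(x) \\
&\text{subject to} && c_i(x) = 0, \quad i = 1, \ldots, m_1, \\
&                  && g_i(x) \le 0, \quad i = 1, \ldots, m_2,
\end{aligned}
\tag{2}
\]

where $\phi : \mathbb{R}^n \to \mathbb{R}$ and $c_i, g_i : \mathbb{R}^n \to \mathbb{R}$.

The sense in which GlobSol will solve problem (2) is

Given a box $\mathbf{x} = ([\underline{x}_1, \overline{x}_1], \ldots, [\underline{x}_n, \overline{x}_n])$, find small boxes
$\mathbf{x}^* = ([\underline{x}_1^*, \overline{x}_1^*], \ldots, [\underline{x}_n^*, \overline{x}_n^*])$ such that any solutions of

\[
\begin{aligned}
&\text{minimize}   && \phi(x) \\
&\text{subject to} && c_i(x) = 0, \quad i = 1, \ldots, m_1, \\
&                  && g_i(x) \le 0, \quad i = 1, \ldots, m_2,
\end{aligned}
\tag{3}
\]

where $\phi : \mathbb{R}^n \to \mathbb{R}$ and $c_i, g_i : \mathbb{R}^n \to \mathbb{R}$,
are guaranteed to be within one of the $\mathbf{x}^*$ that has been found.
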
4.3 Installation and Use of GlobSol

The main requirements for GlobSol are 

1. a standard-conforming Fortran 90 or Fortran 95 compiler, and 

2. a “make” utility. 

A Fortran compiler is required because the user defines the optimization problem 
as a Fortran program. Even though GlobSol is compiled and linked only once 
(and the user’s program is compiled and linked separately), the same version 
of the same compiler must nonetheless be used for both building GlobSol and 
compiling the user’s input. 

GlobSol can be obtained as a “zip” file from
http://interval.louisiana.edu/GlobSol/download_globsol.html
From there, one downloads a compressed file and an “unpack” script appropriate
to the particular operating system and compiler. The scripts are for compilers
on various Unix/Linux and Microsoft systems. However, the makefile that builds
GlobSol has extensive in-line documentation, and can be changed as appropriate
for new compilers and systems.

Succinct instructions for installing GlobSol appear in
http://interval.louisiana.edu/GlobSol/install.html.

GlobSol has extensive configuration options, accessible by editing a
configuration file. GlobSol is run by supplying a command-line script. A simple example
is accessible by following the installation instructions. For more details, see [20],
or examine the various preprints related to GlobSol at
http://interval.louisiana.edu/preprints.html.

4.4 Improvements to GlobSol in Progress 

GlobSol works relatively well for unconstrained problems, but performs poorly
when there are many equality constraints. There are several reasons for this. We
give these reasons, along with present work to overcome these problems, in the 
following paragraphs. 



Obtaining Upper Bounds on the Global Optimum. First, GlobSol is weak 
at finding an upper bound on the global optimum, when constrained optimization 
is used. For unconstrained optimization, GlobSol (and other interval branch 
and bound algorithms) can obtain an upper bound on the global optimum by 
evaluating the objective function at any point x (and using outwardly rounded 
interval arithmetic in the evaluation, for mathematical rigor); the closer x is 
to an actual global optimizing point x*, the sharper the upper bound on the
global optimum. For constrained problems, there is a complication as outlined 
in [18, §5.2.4]: the interval evaluation needs to be taken over a small box in which 
a feasible point has been proven to lie. However, the same principle holds for 
constrained problems. 
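
For the unconstrained case, the principle can be sketched in a few lines. The code below is not GlobSol's interval arithmetic: it emulates outward rounding by moving each computed endpoint one floating-point number outward with std::nextafter (assuming IEEE 754 round-to-nearest arithmetic), and the objective phi is a hypothetical example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Minimal interval type. Outward rounding is emulated by stepping every
// computed endpoint one floating-point number outward (an assumption of
// this sketch; conservative, but simple).
struct Interval {
    double lo, hi;
};

const double kInf = std::numeric_limits<double>::infinity();

Interval widen(double lo, double hi) {
    return {std::nextafter(lo, -kInf), std::nextafter(hi, kInf)};
}
Interval operator+(Interval a, Interval b) { return widen(a.lo + b.lo, a.hi + b.hi); }
Interval operator-(Interval a, Interval b) { return widen(a.lo - b.hi, a.hi - b.lo); }
Interval operator*(Interval a, Interval b) {
    const double p[4] = {a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi};
    double lo = p[0], hi = p[0];
    for (double v : p) {
        lo = std::fmin(lo, v);
        hi = std::fmax(hi, v);
    }
    return widen(lo, hi);
}

// Hypothetical objective: phi(x) = (x1 - 1)^2 + (x2 + 2)^2, minimum 0 at (1, -2).
Interval phi(Interval x1, Interval x2) {
    const Interval a = x1 - Interval{1.0, 1.0};
    const Interval b = x2 + Interval{2.0, 2.0};
    return a * a + b * b;
}

// Upper bound on the global minimum: the upper endpoint of the interval
// evaluation of phi at the point (x1, x2), taken as degenerate intervals.
double upper_bound_at(double x1, double x2) {
    return phi(Interval{x1, x1}, Interval{x2, x2}).hi;
}
```

Evaluating at a point close to the minimizer, such as upper_bound_at(1.1, -1.9), gives a bound slightly above 0.02, whereas a point far from it gives a correspondingly weaker bound.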
For unconstrained problems, GlobSol uses a simple steepest descent procedure
followed by the MINPACK-1 routine HYBRJ1 [33] to find a critical point of
the Fritz-John equations, to increase the chances that x is near a global
optimizer. The MINPACK routines are freely available through NETLIB
(http://www.netlib.org/), and can thus be distributed with GlobSol. In contrast,
until recently, good routines that find approximations x to local optimizers
of constrained problems have been proprietary, and cannot be distributed with
GlobSol. Since GlobSol is meant to be self-contained, we have instead provided
our own routine that employs a generalized-inverse-based Newton method to
project onto the feasible set [21]. As a consequence, in the constrained case,
GlobSol finds rigorous upper bounds for the global optimizer, but may not find
a reasonably sharp upper bound until late in the search process. For some
applications, this is not a problem, but it can have a disastrous effect on efficiency
in others.

Recently, Wächter’s quality Fortran code Ipopt for constrained optimization
(see http://www-124.ibm.com/developerworks/opensource/coin/ and [50]) 
has become available under the Common Public License. (See 
http://www.opensource.org/licenses/cpl.php.) This code should provide 
approximate feasible points x that are highly likely to be near global optimizers, 
thus enabling GlobSol to compute sharp upper bounds on global optimizers in 
the constrained case. We have recently interfaced Ipopt with GlobSol, and we 
are formulating experiments to analyze performance improvements. 



Obtaining Lower Bounds on the Range over Large Regions. A good 
upper bound on the global optimum is generally combined in global search al- 
gorithms with good lower bounds on the range of the objective function over 
subregions x of the search space. If the lower bound of the objective over x is 
larger than the upper bound on the global optimum, then the subregion x can 
be rejected as not containing any global optima. In principle, a simple interval 
evaluation (occasionally replaced by a mean value extension) of the objective 
over x provides the required lower bound. Such a simple interval evaluation is 
what is currently implemented in GlobSol. 
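
The rejection test can be sketched as follows. The objective phi(x) = x^2 - 2x, the helper names, and the emulation of outward rounding with std::nextafter are assumptions of this sketch, not GlobSol code. The global minimum of this phi is phi(1) = -1, so -1 serves as the upper bound on the global optimum.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

struct Interval {
    double lo, hi;
};

const double kInf = std::numeric_limits<double>::infinity();

// Emulated outward rounding: step each endpoint one number outward.
Interval widen(double lo, double hi) {
    return {std::nextafter(lo, -kInf), std::nextafter(hi, kInf)};
}

Interval operator-(Interval a, Interval b) { return widen(a.lo - b.hi, a.hi - b.lo); }

// Multiplication of an interval by a nonnegative constant.
Interval scale(double c, Interval a) {
    assert(c >= 0.0);
    return widen(c * a.lo, c * a.hi);
}

// Interval square; unlike a * a, it never produces a negative lower bound
// when the interval straddles zero.
Interval sqr(Interval a) {
    const double l2 = a.lo * a.lo, h2 = a.hi * a.hi;
    if (a.lo >= 0.0) return widen(l2, h2);
    if (a.hi <= 0.0) return widen(h2, l2);
    Interval s = widen(0.0, std::max(l2, h2));
    s.lo = 0.0;
    return s;
}

// Simple interval evaluation of the hypothetical objective phi(x) = x^2 - 2x.
Interval phi(Interval x) { return sqr(x) - scale(2.0, x); }

// A box can be discarded if the lower bound of phi over it exceeds the best
// known upper bound on the global optimum.
bool can_reject(Interval box, double best_upper) {
    assert(box.lo <= box.hi);
    return phi(box).lo > best_upper;
}
```

Over the box [3, 4] the evaluation gives a lower bound near 1, so the box is rejected. Over [2, 3] it gives a lower bound near -2 although the true range is [0, 3], so the box survives: an instance of the overestimation that simple interval evaluation can produce.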

However, since such an evaluation does not take account of the constraints, 
it can have an enormous overestimation. As an example, consider the nonlinear 
minimax problem: 

\[
\min_{x} \max_{1 \le i \le m} |f_i(x)|, \qquad f_i : \mathbb{R}^n \to \mathbb{R}, \quad x \in \mathbb{R}^n, \quad m \ge n. \tag{4}
\]



To date, we have had limited success in solving realistic problems of this type
directly using GlobSol’s non-smooth slope extensions. Alternatively, we can
convert the problem to a smooth problem with Lemaréchal’s technique [29] as
follows:

\[
\min_{x \in \mathbb{R}^n} v \quad \text{such that} \quad
\left\{ \begin{array}{r} f_i(x) \le v \\ -f_i(x) \le v \end{array} \right\}, \quad 1 \le i \le m. \tag{5}
\]

In (5), we have introduced a single additional slack variable v, which becomes
the value of the objective function. If v is treated as an arbitrary additional 
independent variable, then GlobSol presently employs constraint propagation 
to narrow the range of v when a subset of the region for the variables x is 
given. However, this process does not take account of the coupling between the 
constraints, and has not enabled GlobSol to solve minimax problems efficiently. 
Furthermore, interval Newton methods applied to the Lagrange multiplier (or 
Fritz-John) system associated with (5) over large regions have not adequately 
accelerated the search process within GlobSol for realistic minimax problems. 
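
As a small worked illustration of this reformulation, consider a hypothetical instance with $n = 1$, $m = 2$, $f_1(x) = x - 1$, and $f_2(x) = x + 1$ (these functions are chosen here for illustration and do not come from the GlobSol test set):

```latex
% Minimax form: minimize the larger of |x - 1| and |x + 1|.
\[
\min_{x \in \mathbb{R}} \max\{\, |x - 1|,\; |x + 1| \,\}
\]
% Smooth form with the slack variable v: each absolute value
% contributes two linear inequality constraints.
\[
\min_{x \in \mathbb{R}} v \quad \text{such that} \quad
x - 1 \le v, \quad -(x - 1) \le v, \quad x + 1 \le v, \quad -(x + 1) \le v.
\]
% Adding the second and third constraints gives 2 \le 2v, so v \ge 1,
% and v = 1 is attained at x = 0.
\[
-(x - 1) + (x + 1) = 2 \le 2v \quad \Longrightarrow \quad v \ge 1,
\qquad (x, v) = (0, 1) \text{ is optimal.}
\]
```

The bound $v \ge 1$ only appears when two constraints are combined, which illustrates why treating each constraint separately loses information about their coupling.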

In contrast, Floudas [10], Sahinidis [47] and their respective groups have
used convex or linear relaxations of problem (2) to significant advantage in
non-verified global optimization software. For example, to obtain a lower bound on
an objective function over a region x, the objective $\phi$ in problem (2) is replaced
by a convex (or linear) objective that is known to be less than or equal to
the actual objective over x. Each left member $g_i$ of the inequality constraints is
similarly replaced by a convex (or linear) underestimator. Likewise, each equality
constraint $c_i(x) = 0$ is replaced by the two inequality constraints $c_i(x) \le 0$ and
$-c_i(x) \le 0$, and then underestimated. The optimum of the resulting convex (or
linear) program then is less than or equal to the global optimum of the original
problem (2).
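
A one-dimensional sketch shows the effect of a linear underestimator. It assumes the convex example objective phi(x) = x^2 - 2x (hypothetical, chosen for illustration), uses the tangent line at the midpoint of the region as the underestimator, and works in plain floating-point arithmetic, so the result is not rigorous. It is not the technique of [10] or [47], only the simplest instance of the idea.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical convex objective and its derivative.
double phi(double x) { return x * x - 2.0 * x; }
double dphi(double x) { return 2.0 * x - 2.0; }

// Lower bound on phi over [a, b] from a linear relaxation: since phi is
// convex, its tangent at the midpoint c lies below phi everywhere, and a
// linear function attains its minimum over [a, b] at an endpoint.
double linear_lower_bound(double a, double b) {
    assert(a <= b);
    const double c = 0.5 * (a + b);
    const double at_a = phi(c) + dphi(c) * (a - c);
    const double at_b = phi(c) + dphi(c) * (b - c);
    return std::min(at_a, at_b);
}
```

On [2, 3] the relaxation yields the lower bound -0.25, while the natural interval evaluation [2,3]*[2,3] - 2*[2,3] = [4,9] - [4,6] yields -2; the true minimum over the region is 0.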

Experimenting with Sahinidis’ BARON [44] software, we have been able to
successfully find global optima of minimax problems of the form (4). Apparently,
these techniques succeed where the others do not because

1. they take account of the coupling between the constraints, and 

2. the resulting relaxations (i.e. the derived simpler problems) have solutions, 
and these solutions are easy to obtain. 

The computations with convex or linear underestimators can be made rigorous
with the following procedure:

1. Compute a relaxed (simplified) convex or linear problem over x. 

2. Compute the solution to the convex or linear problem with a floating-point 
solver that gives an approximate solution x. 

3. Use x with the validation technique in [13] to provide a rigorous lower bound 
on the solution to the relaxed (and hence on the solution to the original) 
problem over x. 

We are presently experimenting with this procedure, and will eventually incor- 
porate it into GlobSol for minimax procedures. 

Although each group develops different techniques, both Floudas [10] and
Sahinidis [47] develop methods by which these underestimators can be computed
automatically (with automatic-differentiation-like technology; see [10] and [47]);
such techniques (and others) could eventually be incorporated into GlobSol.



Efficiency of GlobSol’s Automatic Differentiation, List Processing, etc. 

As outlined in [18, §1.4 and §2.2], GlobSol interprets an internal representation of 
the objective and constraints, termed a “code list”, to compute point and interval
values of the objective, constraint residuals, Jacobi and Hessian matrices, etc.
This internal representation was designed with simplicity in mind, under the
assumption that problems GlobSol would solve are relatively small and would
not be limited by inefficiencies in function evaluation. However, for a number of
problems, more efficient evaluation of the code list could speed computation.

Experiments by Corliss et al. under the Sun project have indicated that, 
for some problems, converting the code list to Fortran code then compiling it 
gave a noticeable performance improvement, but did not make a difference in 
the practicality of solving particular problems. On the other hand, operations 
for evaluating every constraint and the objective are included in a single code 
list, and all of these operations are performed whenever a particular objective or 
constraint value is needed at a new point (or interval) of evaluation. Separating 
the operations could benefit particular problems. 

Another area of possible efficiency gains in GlobSol is in its list processing.
In the global search, regions $\mathbf{x}$ are repeatedly bisected into $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$; $\mathbf{x}^{(1)}$
is processed further, while $\mathbf{x}^{(2)}$ is stored in a linked list structure. Memory is
allocated whenever a box is stored on the list, and is freed whenever a box
is removed. For some problems, a more sophisticated allocation / deallocation
scheme would greatly improve performance.
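
One possible scheme of this kind is a pool that recycles the storage of removed boxes instead of returning it to the system. The sketch below is written in C++ for illustration (GlobSol itself is Fortran) and is not GlobSol's data structure; the names Box and BoxPool are hypothetical, and the dimension is fixed at compile time so that a stored box needs no separate allocation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical box in N dimensions: lower and upper endpoints per coordinate.
template <std::size_t N>
struct Box {
    std::array<double, N> lo, hi;
};

// Pool that reuses the slots of removed boxes. Memory grows only when no
// free slot is available, so repeated store/take cycles do not allocate.
template <std::size_t N>
class BoxPool {
public:
    // Store a box and return the index of its slot.
    std::size_t store(const Box<N>& b) {
        if (!free_.empty()) {
            const std::size_t idx = free_.back();
            free_.pop_back();
            slots_[idx] = b;  // overwrite a recycled slot
            return idx;
        }
        slots_.push_back(b);
        return slots_.size() - 1;
    }

    // Remove the box in slot idx and mark the slot as reusable.
    // The caller must not take the same slot twice.
    Box<N> take(std::size_t idx) {
        assert(idx < slots_.size());
        free_.push_back(idx);
        return slots_[idx];
    }

    std::size_t capacity() const { return slots_.size(); }

private:
    std::vector<Box<N>> slots_;
    std::vector<std::size_t> free_;
};
```

In a bisection loop, the half that is set aside would be passed to store, and take would later retrieve it; after the pool has reached its working size, neither call touches the memory allocator.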

Although, with time, we intend to implement these GlobSol improvements,
we do not place them at as high a priority as algorithmic improvements, such as
use of convex underestimators. In our view, fundamental algorithmic
improvements will do more to advance both the practicality of GlobSol and the
fundamental state of the art in verified global optimization.

4.5 Simplification of GlobSol 

At present, there are many optional algorithm paths in GlobSol, some of which 
are not used. This is a result of the original research nature of GlobSol. Eventu- 
ally, some of these paths (along with supporting code) can be eliminated. 

Other improvements in this general category include updating GlobSol’s in- 
stallation scripts. 

4.6 Summary 

In this section, we have described GlobSol, validated global optimization soft- 
ware for Fortran. GlobSol represents a little over a decade of work on algorithms 
and implementations. GlobSol is unusual among such packages in being openly 
available and self-contained. Although GlobSol has weaknesses for certain kinds 
of constrained problems, we are excited about alternate algorithms, as yet un- 
tried in a validated context, that promise to remove many of these weaknesses. 
5 ACETAF (Markus Neher)

5.1 Introduction 

The software package ACETAF has been developed by Ingo Eble and Markus 
Neher. It is a C++ program for the accurate computation of error bounds for 
Taylor coefficients of analytic functions. ACETAF originated from a subroutine 
in a program for the validated solutions of ODEs [34] and has evolved over three 
years to its present state, which includes additional features besides the
computation of bounds for Taylor coefficients. For a user-defined complex function $f$,
the following problems are solved with ACETAF. (We list the problems in the
order in which they rely on each other.)

- Rigorous computation of leading Taylor coefficients. 

- Check of analyticity in a user-defined disc. 

- Rigorous computation of bounds for Taylor coefficients with arbitrary order. 

- Rigorous computation of bounds for Taylor remainder series. 

In section 5.2, we report on the scope of ACETAF and the mathematical
background for the problems solved by ACETAF. Section 5.3 deals with the
availability of ACETAF. In the last section, we present a numerical example.



5.2 Range of Use of ACETAF 

Admissible Functions. For all features of the program, the user may enter an
expression for a function $f$ that must belong to the following set of admissible
functions:

- Polynomials and rational functions, 

- the exponential function, the sine, the cosine and the tangent function, 

- the principal branch of the logarithm, 

- the principal branches of the square root and of other roots with rational or 
(floating-point) real exponents, 

- the principal branches of the inverse trigonometric functions, and 

- finite compositions of these functions, such as $\exp(z^2)$ or $\tanh(\ln(z^2 + 1)/z)$.

Loops and branches are not allowed in the expression for $f$. For roots,
logarithms, or inverse functions, principal branches are always assumed by the
program. For example, $\ln z$ is interpreted as the principal branch $\ln|z| + i\,\mathrm{Arg}\,z$
of the logarithm (with $\mathrm{Arg}\,z \in (-\pi, \pi)$). As a special consequence, $\ln z$ is not
defined if $z$ is a negative real number.

Furthermore, the underlying mathematical theory of the algorithms in
ACETAF requires that $f$ be analytic in a user-defined disc in the complex plane.
On request of the user, the program checks whether the user-defined function $f$
is analytic on the given disc.
Rigorous Computation of Values and Ranges of Functions. The algo-
rithms that are employed in ACETAF rely on function values and ranges of 
functions. The validated determination of these is accomplished with interval 
computations [1,14,32,38]. Floating-point interval arithmetic [26,28] is used in 
the practical calculations to handle all roundoff errors. We assume that the 
reader is familiar with interval computations. We only introduce some notation, 
and we recall the definition of an inclusion function. 

The range of a function $f$ on a domain $D$ is denoted by $f(D)$, i.e. $f(D) :=
\{ f(z) \mid z \in D \}$. An inclusion function $F$ of a given function $f$ on $D \subseteq \mathbb{C}$ is an
interval function (an expression that can be evaluated according to the rules of
interval arithmetic) that encloses the range of $f$ on all intervals $\mathbf{z} \subseteq D$:

\[
F(\mathbf{z}) \supseteq f(\mathbf{z}) \quad \text{for all } \mathbf{z} \subseteq D.
\]

Real and complex floating-point interval arithmetic has been implemented in
a number of programming languages and libraries, such as C-XSC [24], filib++
[30,31], or INTLAB [43]. ACETAF runs with either the C-XSC or the filib++
interval library. Since neither of these libraries includes routines for complex
standard functions, the complex standard functions library CoStLy [9] has also
been included in ACETAF.



Computation of Complex Taylor Coefficients. ACETAF offers the
computation of some leading Taylor coefficients of a user-defined function. These
Taylor coefficients are computed via automatic differentiation [12,42].

The complex Taylor arithmetic is based on a well-known property of admissible
functions: their real and imaginary parts can be expressed as compositions of
real standard functions. For example, if $z = x + iy$, then $e^z = e^x \cos y + i e^x \sin y$.
Braune and Krämer [2] used such decompositions for constructing inclusion
functions; in ACETAF they are used for computing complex derivatives from real
derivatives.

In general, if $f(z) = u(x, y) + i v(x, y)$ then we have $f'(z) = u_x(x, y) + i v_x(x, y)$.
Similarly, specific Taylor coefficients of $f$ are calculated by applying the
well-known formulas of automatic differentiation to the real and the imaginary parts
of $f$, respectively.
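
For the first derivative of the exponential function, this can be sketched with forward-mode automatic differentiation on the real and imaginary parts. The sketch uses plain floating-point numbers and a hypothetical Dual type; ACETAF itself works with interval arithmetic and a full Taylor arithmetic, which are not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Dual number for forward-mode differentiation with respect to x:
// v is the value, d the partial derivative d/dx.
struct Dual {
    double v, d;
};

Dual operator*(Dual a, Dual b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; }
Dual dexp(Dual a) {
    const double e = std::exp(a.v);
    return {e, e * a.d};
}
Dual dcos(Dual a) { return {std::cos(a.v), -std::sin(a.v) * a.d}; }
Dual dsin(Dual a) { return {std::sin(a.v), std::cos(a.v) * a.d}; }

// f(z) = exp(z) with z = x + iy has u = e^x cos y and v = e^x sin y.
// The complex derivative is f'(z) = u_x + i v_x.
std::complex<double> exp_derivative(double x0, double y0) {
    const Dual x{x0, 1.0};  // seed: dx/dx = 1
    const Dual y{y0, 0.0};  // y does not depend on x
    const Dual u = dexp(x) * dcos(y);
    const Dual v = dexp(x) * dsin(y);
    return {u.d, v.d};
}
```

Since the derivative of exp is exp itself, the result of exp_derivative(x0, y0) can be compared with std::exp applied to the complex number x0 + i y0.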



Check of Analyticity. The error bounds on the Taylor coefficients that will
be presented in the next subsection require that $f$ be analytic on the disc $B$.
Multi-valued analytic standard functions are all interpreted as being principal
values with strict domain restrictions.

To detect violations of the analyticity of a user-defined function $f$ on a given
disc, the analyticity of $f$ can be checked before computation of the bounds. If
the proof of analyticity fails on the user-defined disc, then ACETAF computes a
validated lower bound of the maximum radius to the given midpoint, such that
$f$ is analytic on the full disc. This is done by a heuristic algorithm which uses
bisection of the radius of the given disc.
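
The bisection idea can be sketched as follows. The function names are hypothetical and the details of ACETAF's heuristic are not reproduced. The analyticity test is passed in as a predicate; as an example, the principal branch of the logarithm is used, for which a disc is admissible exactly when it avoids the branch cut along the non-positive real axis. The example predicate uses plain floating-point arithmetic, whereas ACETAF's check is validated.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <functional>

// Bisection on the radius: r_failed is a radius for which the proof of
// analyticity failed. Returns a radius for which the proof succeeded
// (or 0 if it never succeeded), i.e. a lower bound on the largest
// admissible radius.
double analytic_radius_lower_bound(const std::function<bool(double)>& proves_analytic,
                                   double r_failed, int steps = 40) {
    assert(r_failed > 0.0 && steps > 0);
    double good = 0.0;      // largest radius with a successful proof so far
    double bad = r_failed;  // smallest radius with a failed proof so far
    for (int s = 0; s < steps; ++s) {
        const double mid = 0.5 * (good + bad);
        if (proves_analytic(mid)) {
            good = mid;
        } else {
            bad = mid;
        }
    }
    return good;
}

// Example predicate: the principal branch of ln is analytic on the open disc
// around z0 with radius r if the disc stays clear of the cut (-inf, 0].
bool log_analytic_on_disc(std::complex<double> z0, double r) {
    const double dist = z0.real() > 0.0 ? std::abs(z0) : std::fabs(z0.imag());
    return r < dist;
}
```

For the midpoint 1 + i and a failed radius of 2, the procedure returns a value just below sqrt(2), the distance from the midpoint to the origin.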
Because the regions of analyticity are hard to detect for composite func-
tions, the analyticity check is always recommended before the computation of 
the bounds for the Taylor coefficients. 



Bounds for Taylor Coefficients with Arbitrary Order. The rigorous com- 
putation of bounds for Taylor coefficients with arbitrary order is the main feature 
of ACETAF. Such bounds are used for error analyses in numerical computations. 
For example, they are used in the well-known Taylor series method for the so- 
lution of ODEs [35]. Geometric series bounds for Taylor coefficients of analytic 
functions are also used for finding multiple zeros or clusters of zeros. In [45], the 
availability of such bounds is assumed, but no method for their computation is 
mentioned. 

In ACETAF, four methods for calculating such bounds are implemented.
Method I is Cauchy’s estimate (6). For a function

\[
f(z) = \sum_{j=0}^{\infty} a_j z^j, \qquad |z| < r,
\]

that is analytic on a disc $B := \{z : |z| < r\}$ with positive radius $r$ and bounded
on the circle $C := \{z : |z| = r\}$, it holds that

\[
|a_j| \le \frac{M(r)}{r^j}, \qquad j \in \mathbb{N}_0, \tag{6}
\]

where $M(r) := \max_{|z| = r} |f(z)|$.

The calculation of $M(r)$ poses a simple global optimization problem (cf.
section 4 of this paper). In ACETAF, the following branch and bound algorithm
is employed to compute a validated upper bound for Cauchy’s estimate for an
analytic function $f$ and a given circle $C$ with radius $r$:

1. For some $k_{\max} \in \mathbb{N}$, $C$ is split into segments $S_k$, $k = 1, \ldots, k_{\max}$, which are
gathered in a list $L$.

2. Each segment $S_k$ is covered by a rectangular complex interval $\mathbf{z}_k$. With an
inclusion function $F$ of $f$, a set $\mathbf{w}_k = [\underline{w}_k, \overline{w}_k] \supseteq |f|(\mathbf{z}_k)$ is computed.

3. $\overline{M} := \max_k \overline{w}_k$ and $\underline{M} := \max_k \underline{w}_k$ are guaranteed upper and lower bounds for
$M(r)$, respectively.

4. If $\overline{M} - \underline{M}$ is sufficiently small, then the algorithm is terminated. Otherwise,
elements $S_k$ that cannot contain a maximum of $|f|$ are eliminated from the
list $L$. The remaining segments are bisected and gathered in a new list $L$.
The algorithm is then continued with step 2.
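
Steps 1 to 3 can be sketched for a single uniform partition of the circle. The code below is not taken from ACETAF: the interval types are hypothetical, the example function f(z) = z^2 + z is chosen for illustration, and the arithmetic is plain floating-point without directed rounding, so the bounds are not validated. The number of segments is required to be a multiple of 4, so that each segment lies in one quadrant and the rectangle spanned by its endpoints covers it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Interval {
    double lo, hi;
};
Interval operator+(Interval a, Interval b) { return {a.lo + b.lo, a.hi + b.hi}; }
Interval operator-(Interval a, Interval b) { return {a.lo - b.hi, a.hi - b.lo}; }
Interval operator*(Interval a, Interval b) {
    const double p[4] = {a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi};
    return {*std::min_element(p, p + 4), *std::max_element(p, p + 4)};
}

// Rectangular complex interval: real part and imaginary part.
struct CInterval {
    Interval re, im;
};
CInterval operator+(CInterval a, CInterval b) { return {a.re + b.re, a.im + b.im}; }
CInterval operator*(CInterval a, CInterval b) {
    return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

// Smallest and largest absolute value of the members of a real interval.
double mig(Interval a) {
    if (a.lo <= 0.0 && 0.0 <= a.hi) return 0.0;
    return std::min(std::fabs(a.lo), std::fabs(a.hi));
}
double mag(Interval a) { return std::max(std::fabs(a.lo), std::fabs(a.hi)); }

// Enclosure of |z| over the rectangle z.
Interval abs_bounds(CInterval z) {
    return {std::hypot(mig(z.re), mig(z.im)), std::hypot(mag(z.re), mag(z.im))};
}

// Inclusion function of the example f(z) = z^2 + z.
CInterval F(CInterval z) { return z * z + z; }

// Lower and upper bounds on M(r) = max |f(z)| over |z| = r from a uniform
// partition of the circle into kmax segments (kmax a multiple of 4).
Interval bound_M(double r, int kmax) {
    assert(r > 0.0 && kmax > 0 && kmax % 4 == 0);
    const double two_pi = 2.0 * std::acos(-1.0);
    Interval M{0.0, 0.0};
    for (int k = 0; k < kmax; ++k) {
        const double t0 = two_pi * k / kmax, t1 = two_pi * (k + 1) / kmax;
        const double c0 = r * std::cos(t0), c1 = r * std::cos(t1);
        const double s0 = r * std::sin(t0), s1 = r * std::sin(t1);
        const CInterval zk{{std::min(c0, c1), std::max(c0, c1)},
                           {std::min(s0, s1), std::max(s0, s1)}};
        const Interval wk = abs_bounds(F(zk));
        M.lo = std::max(M.lo, wk.lo);  // lower bound: max of the lower endpoints
        M.hi = std::max(M.hi, wk.hi);  // upper bound: max of the upper endpoints
    }
    return M;
}
```

For this example, M(1) = 2 (attained at z = 1), and bound_M(1.0, 1024) returns an interval around 2. Dividing the upper endpoint by r^j then gives Cauchy's estimate for the jth coefficient. Step 4 would discard every segment whose upper endpoint lies below the current lower bound and bisect the rest.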

The three other methods that are implemented in ACETAF are variants of
Cauchy’s estimate, which have been developed in [36]. In method II, Cauchy’s
estimate is applied to the defect of some Taylor polynomial approximation of $f$;
in method III, Cauchy’s estimate is applied to some derivative of $f$. The most
general method IV is a generalization of the other three methods. Instead of
$M(r)$, the number

\[
V(r, m, l) := \max_{|z| = r} \left| f^{(m)}(z) - s_l(z) \right|
\]

is used in the estimation of the Taylor coefficients of $f$, where $m$ and $l$ are
integers, $f^{(m)}$ is the $m$th derivative of $f$, and $s_l$ is the $l$th Taylor polynomial
(expanded at the origin) of $f^{(m)}$. Instead of (6), we obtain [36,37]

\[
|a_j| \le \frac{(j - m)!}{j!} \, \frac{V(r, m, l)}{r^{j-m}} \qquad \text{for } j > m + l. \tag{7}
\]

Letting $s_{-1} = 0$, the four methods correspond to the following choices of $m$
and $l$:

- Method I (Cauchy’s estimate) is obtained for $m = 0$, $l = -1$,

- method II consists of the choice $m = 0$, $l \ge 0$,

- method III consists of the choice $m > 0$, $l = -1$,

- method IV uses $m > 0$, $l \ge 0$.

For $m > 0$ in (7), the remainder series of $f$ is bounded by a series that
converges faster than any geometric series, for all $z \in B$. Thus, the estimate (7)
is a considerable improvement over Cauchy’s estimate.
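
The way estimate (7) follows from Cauchy's estimate (6) can be sketched in three steps. This is an outline of the reasoning under the definitions above, not the full proof given in [36,37]:

```latex
% Step 1: differentiate the Taylor series of f termwise m times.
\[
f^{(m)}(z) = \sum_{j \ge m} \frac{j!}{(j-m)!}\, a_j\, z^{j-m}.
\]
% Step 2: subtracting the l-th Taylor polynomial s_l of f^{(m)}
% removes the powers z^0, ..., z^l, i.e. the terms with j - m <= l.
\[
f^{(m)}(z) - s_l(z) = \sum_{j > m+l} \frac{j!}{(j-m)!}\, a_j\, z^{j-m}.
\]
% Step 3: apply Cauchy's estimate (6) to the coefficient of z^{j-m}
% of this function, whose maximum modulus on |z| = r is V(r,m,l).
\[
\frac{j!}{(j-m)!}\, |a_j| \le \frac{V(r,m,l)}{r^{\,j-m}}
\quad \Longrightarrow \quad
|a_j| \le \frac{(j-m)!}{j!}\, \frac{V(r,m,l)}{r^{\,j-m}}, \qquad j > m + l.
\]
```

For $m = 0$ and $l = -1$, the factor $(j-m)!/j!$ equals 1 and $V(r, 0, -1) = M(r)$, so the estimate reduces to (6).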

Extensive numerical testing has shown that the above optimization algorithm
with recursive splittings is optimal neither with respect to the accuracy
of the computed bound $V(r, m, l)$ nor with respect to the computation time.
As an alternative to adaptive bisection, the user of the program can invoke a
fixed uniform partitioning of $C$ with some $k_{\max}$ segments $S_k$. Based on
user-defined values for $m$ and $k_{\max}$, the program determines the order $l$ of the Taylor
polynomial such that $l$ is sufficiently large for a good approximation of $f^{(m)}$ by
its Taylor polynomial, but reasonably small with respect to the overall computation
time [8].



Bounds for Taylor Remainder Series. In addition to bounds for the Taylor
coefficients of $f$, ACETAF also computes bounds for the Taylor remainder
series $R_p(z) := \sum_{j=p+1}^{\infty} a_j z^j$ of $f$, for some $z$ with $|z| < r$. Bounds for $R_p$ are
obtained from summing up the respective estimates for the Taylor coefficients in
the remainder series. For the methods I and II, the remainder series is estimated
by a geometric series. For example, in method I we obtain the estimate

\[
|R_p(z)| \le M(r) \, \frac{\left( |z|/r \right)^{p+1}}{1 - |z|/r}.
\]

A closed expression for the majorizing remainder series in methods III and
IV is given in [37].
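
For method I, the coefficient bound and the geometric series bound are simple to evaluate once a bound M on M(r) is known. The sketch below uses plain floating-point arithmetic (not validated) and hypothetical function names, and it assumes that the remainder collects the terms with index j > p.

```cpp
#include <cassert>
#include <cmath>

// Cauchy's estimate: |a_j| <= M / r^j, where M bounds |f| on the circle |z| = r.
double cauchy_coefficient_bound(double M, double r, int j) {
    assert(M >= 0.0 && r > 0.0 && j >= 0);
    return M / std::pow(r, j);
}

// Geometric series bound for the remainder sum over j > p of a_j z^j,
// valid for |z| < r: summing M (|z|/r)^j over j > p gives
// M q^(p+1) / (1 - q) with q = |z| / r.
double remainder_bound(double M, double r, int p, double abs_z) {
    assert(M >= 0.0 && r > 0.0 && p >= 0);
    assert(abs_z >= 0.0 && abs_z < r);
    const double q = abs_z / r;
    return M * std::pow(q, p + 1) / (1.0 - q);
}
```

As a check, f(z) = 1/(2 - z) has the coefficients a_j = 2^-(j+1) and M(1) = 1. The coefficient bound for r = 1 is 1 for every j, and remainder_bound(1.0, 1.0, 2, 0.5) evaluates to 0.25, well above the true remainder of about 0.0104. A radius closer to the singularity at z = 2 would give sharper bounds.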
5.3 Availability of ACETAF

Our program is available in two versions, depending on the interval library that is 
used: C-XSC [24] or filib++ [30,31]. The C-XSC library is more comprehensive 
than filib++, but the latter is much faster than the former. The libraries C-XSC 
and filib++ are distributed under the terms of the GNU Lesser General Public 
License (formerly called GNU Library General Public License) [11]. 

ACETAF is distributed under the terms of the GNU General Public License 
[11]. The software is currently available at the following sites: 

C-XSC and filib++: http://www.xsc.de and 

ACETAF: http://www.uni-karlsruhe.de/~Markus.Neher/acetaf.html

At the moment, C-XSC supports the following platforms: 

GNU C++ compilers gcc 2.95.2 or higher, PC with Linux, 

GNU C++ compilers gcc 2.95.2 or higher, Sun Solaris workstation. 

filib++ requires one of the GNU C++ compilers gcc 2.95.2 or higher, or the 
KAI C++ compiler. The filib++ macro library (which is used by ACETAF) is 
only supported on x86 systems and requires the use of GNU make. 

ACETAF has been extensively tested and has been found to be reliable and 
robust. Of course, even though it is software for validated computations, it is 
subject to the same possible errors as conventional software. The program is 
distributed in the hope that it will be useful, but without any warranty. 



Graphical User Interface. All input data (such as the order l of the Taylor 
polynomial in the computation of V(r,m,l), the maximal number of intervals 
in the list of the branch and bound algorithm, etc.) can be entered via a self- 
explanatory graphical user interface. The values are stored in an output file of 
the computation, and this file can be reused in other calculations. 

The user of the program can enter four parameter values, which control the 
termination of the branch and bound algorithm: 

- $t_{\max}$, the maximum computation time;

- $\varepsilon_{\mathrm{abs}}$, the tolerated absolute error of the interval enclosure for the
respective bound of each method;

- $\varepsilon_{\mathrm{rel}}$, the tolerated relative error; and

- $k_{\max}$, the maximum number of subintervals.

The computation is terminated when at least one of these termination criteria 
is fulfilled. 
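
The interplay of the four termination parameters can be sketched as follows. This is a minimal illustration and not ACETAF code: the names `Limits` and `should_stop` are hypothetical, and the relative error is assumed here to be measured against the magnitude of the current upper bound, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    t_max: float    # maximum computation time in seconds
    eps_abs: float  # tolerated absolute error of the enclosure
    eps_rel: float  # tolerated relative error of the enclosure
    k_max: int      # maximum number of subintervals


def should_stop(limits, elapsed, lower, upper, n_subintervals):
    """Return the names of all termination criteria that are fulfilled.

    [lower, upper] is the current enclosure of the bound being computed.
    The relative test against |upper| is an assumption of this sketch.
    """
    width = upper - lower
    reasons = []
    if elapsed >= limits.t_max:
        reasons.append("t_max")
    if width <= limits.eps_abs:
        reasons.append("eps_abs")
    if upper != 0 and width <= limits.eps_rel * abs(upper):
        reasons.append("eps_rel")
    if n_subintervals >= limits.k_max:
        reasons.append("k_max")
    return reasons


limits = Limits(t_max=3600.0, eps_abs=0.0, eps_rel=0.1, k_max=1024)
stop_now = should_stop(limits, 12.0, 1.00, 1.05, 300)   # relative error reached
keep_going = should_stop(limits, 12.0, 1.0, 2.0, 300)   # nothing fulfilled yet
```

The branch and bound loop would stop as soon as the returned list is non-empty, which mirrors the rule that one fulfilled criterion suffices.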



Symbolic Expression Handler. ACETAF includes a symbolic expression 
handler, so that arbitrary user-defined compositions of the supported library 
functions can be used. The functions may be defined on arbitrary discs in the 
complex plane. Functions are entered as strings in the usual mathematical
notation. The independent complex variable is represented by the literal “z”.
The literal “i” is used for the imaginary unit. A function expression may contain
constants in the scientific number format (such as 1.234E-05), the arithmetic
operators +, -, *, /, the functions sqr, sqrt, exp, ln, sin, cos, tan, cot, asin,
acos, atan, acot, sinh, cosh, tanh, coth, asinh, acosh, atanh, acoth, and the
following functions with two arguments: power (integer powers), pow (real
powers), and root (integer roots).

5.4 Numerical Example 

Numerical examples were presented in [8,35,37]. Here, we give only one example
for illustration. We show a table of upper bounds for M(r) and V(r, m, l) for
different choices of m and l, for several radii. The termination parameters are
set to $\varepsilon_{\mathrm{rel}} = 0.1$, $\varepsilon_{\mathrm{abs}} = 0$, and $t_{\max} = 3600$ seconds (so that the
program is not aborted for exceeding the time limit). We used $k_{\max} = 1024$ for
the computation of M(r) and $k_{\max} = 8192$ for the computation of V(r,m,l).

The table includes bounds for some of the Taylor coefficients and for some
remainder sums of the respective functions, which were computed with ACETAF
2.8 and the filib++ interval library. The computation times (in seconds) were
obtained on a PC with a 1200 MHz Athlon processor.



Table 3. Bounds for Taylor coefficients of $f(z) = (\cos z)/(z^2 + 101)$.

Example: Bounds for Taylor Coefficients of $f(z) = (\cos z)/(z^2 + 101)$.

$f$ has a singularity at $z = \sqrt{101}\, i$, and the circle with radius 10 is very close to
this point. Nevertheless, the computation of M and V is feasible, although the
bounds obtained for XV (10, — 1 ,m) increase rapidly with m.

For small radii, both methods II and III improve the bounds for the Taylor
coefficients $a_j$ and for the remainder series $R_p$ by several orders of magnitude
compared to the bounds that result from Cauchy’s estimate.

6 Summary 

We have presented four different software tools for verified computations. It 
appears that these packages are as diverse as the applications for which they 
were developed. 

Each of the four packages is written so that the user can define a particular
problem in the same way it would be defined for non-rigorous software for the
same purpose. The careful bounding of truncation errors is hidden in the code,
where an inexperienced user may not even spot the difference. This is also true
for the treatment of roundoff errors, which some of the packages in this paper do
not handle themselves but delegate to other well-known interval libraries.
Hence, even a user knowing nothing about interval arithmetic or roundoff errors
can use the software in the same way as conventional software.

As the development of computers continues, the question of computation 
times will become less important. Whether a computer program needs only five 
milliseconds or a full second to solve a particular problem is often irrelevant. 
Hence, if rigorous software is as simple to use as non-validated software, more 
users will be willing to use it to get validated results for their problems, even if 
the computation may take longer. 

Multiple Precision Interval Packages: Comparing Different Approaches

Abstract. We give a survey of packages for multiple precision interval
arithmetic, with the main focus on three specific packages. One is
a Maple package, intpakX, and two are C/C++ libraries, GMP-XSC
and MPFI. We discuss their different features, present timing results
and show several applications from various fields, where high precision
intervals are fundamental.



1 Why Develop Multiple Precision Interval Packages? 

1.1 Need for Arbitrary Precision Interval Arithmetic 

Multiple precision is a floating-point arithmetic in which the number of digits
of the mantissa can be any fixed or variable value. It is usually applied to problems
where it is important to have a high accuracy (e.g., many digits of π). However,
for algorithms where extra computing precision is required (these are mostly 
numerical algorithms) it is important to distinguish between predictable and 
unpredictable loss of accuracy. If this loss is predictable, then multiple precision 
arithmetic perfectly fulfils the application’s needs. When it is unpredictable, 
interval arithmetic can prove useful to bound this loss of accuracy. Of course, 
this interval arithmetic must also be based on a multiple precision arithmetic. 
Hence, we are particularly interested in 

numerical problems, with a large and unpredictable loss of ac- 
curacy. 

Although multiple precision interval arithmetic might help, one should be
aware of the fact that this often means an increase in the computational time 
and memory usage, cf. Section 3. 

The literature is inconsistent about the exact meaning of the term multiple 
precision. Sometimes multiple precision refers only to extended and fixed pre- 
cision, whereas arbitrary precision is used for variable precision. In this paper, 
multiple precision refers to extended precision, whether it is variable or not.
Arbitrary precision arithmetic offers the possibility to set the precision to an
arbitrary value as needed in the computations; this can be done either statically
or dynamically, i.e. during the computations. Interval packages based on GMP (GNU
multiple precision) arithmetic or Maple arithmetic are of this kind. But there are also
approaches offering multiple precision arithmetic without the possibility to vary 
the precision, for example the staggered multiple precision arithmetic in the XSC 
(extended Scientific Computing) languages [25,26]. 

1.2 Organization of the Paper 

The motivations and needs for multiple precision interval arithmetic packages 
are discussed in this first part. The second part consists of a survey of various 
packages, and in particular the packages developed by the authors are presented: 
intpakX for Maple, MPFI in C and GMP-XSC in C++. In the third part, a com- 
parison in terms of performance is conducted. In the last part, various applica- 
tions are presented: interval Newton, range enclosure, linear algebra, quadrature, 
application to mathematical finance, global optimization. 

1.3 Interval Arithmetic in Software Packages for Scientific 
Computing 

The reasons for the implementation of an interval package for scientific comput- 
ing software, such as MatLab, Maple or Mathematica, are different from those 
motivating interval libraries for standard programming languages like C++ (see 
Section 1.4). 

These software environments are powerful tools for various kinds of com- 
putations, but, in contrast to programming languages, they primarily aim at 
usability, convenience and visualization of data. Moreover, they serve as means 
of education in schools and universities. 

In addition to the general reasons for the implementation of an interval pack- 
age, these packages serve the following purposes: 

— combine symbolic computation with interval evaluation for computer algebra
systems (Maple or Mathematica);

— check results computed by this software or results from different environ- 
ments by graphically displaying them; 

— learn or teach interval arithmetic; 

— use interval arithmetic without the need of being fully familiar with the 
concepts of a programming language. 

One further reason especially applies to environments offering symbolic
computation and multiple precision at the same time:

— In a computer algebra environment, the inexperienced user is apt to mistake 
rounded results for exact results, since symbolic computations are free of 
round-off errors, and he might expect that this will hold for the rest of his 
computations as well. 

The combination of multiple precision and interval arithmetic is a way to 
fulfil this expectation. 

Moreover, arbitrary precision is a much more natural way to deal with num- 
bers than the standardized floating-point arithmetic. This point has to be partic- 
ularly mentioned regarding the fact that an environment like Maple (especially 
with a GUI) serves teaching purposes. 

1.4 Libraries for Arbitrary Precision Interval Arithmetic:
Efficiency Issues

Other considerations apply to the implementation of multiple precision interval 
arithmetic libraries for programming languages. Here, the main issue is efficiency 
rather than ease of use and suitability for educational purposes. Indeed, the in- 
tended user is expected to be already familiar with a programming language and 
willing to incorporate interval computations into his/her programs. However, few 
programming languages or compilers have native interval datatypes and oper- 
ations (cf. Section 2.2). Thus, to allow interval computations in environments 
that do not support intervals, the solution consists in developing libraries. 

Libraries developed for an existing programming language are compiled, i.e. 
interval operations are executed faster than within an interpreted package, which 
was detailed in the previous section. Furthermore, the memory management is 
tailor-made by the programmer of the library, which implies that this memory 
management can be made more efficient than a general one, since it is ded- 
icated to a specific kind of application. A last source of efficiency lies in the 
use of the processor’s arithmetic unit: with XSC (extended Scientific Comput- 
ing) languages (cf. Section 2.2), operations are based on floating-point ones; with 
GMP-based (GNU Multiple Precision) libraries (cf. Section 2.3 and Section 2.4), 
they are based on machine integers. By contrast, in Maple all computations are 
done with radix-10 digits and all operations are thus software ones. 

However, the programming of a multiple precision interval arithmetic library
does not necessarily involve a tremendous amount of work: efficient libraries for
multiple precision floating-point arithmetic can be used as a basis. Much of the
work is then already done; in particular, memory management may already be
handled, as it is by GMP.

Finally, if the chosen programming language offers operator overloading - as 
most object-oriented languages do - then modification of existing applications is 
very easy: indeed, only data types have to be changed. This feature is common 
to most packages developed for scientific computing software environments as 
well as libraries developed in C++ for instance (cf. Section 2.3 and Section 2.4). 
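
The remark that only data types have to be changed can be illustrated with a small sketch. It is written in Python rather than C++ purely for brevity; the class `Interval` and the function `horner` are hypothetical names, and rounding errors are ignored here (a real interval library would round endpoints outward).

```python
class Interval:
    """Closed interval [lo, hi]; endpoint rounding is ignored in this sketch."""

    def __init__(self, lo, hi=None):
        self.lo = lo
        self.hi = lo if hi is None else hi

    @staticmethod
    def _coerce(x):
        # Treat plain numbers as degenerate (point) intervals.
        return x if isinstance(x, Interval) else Interval(x)

    def __add__(self, other):
        o = Interval._coerce(other)
        return Interval(self.lo + o.lo, self.hi + o.hi)

    __radd__ = __add__

    def __mul__(self, other):
        o = Interval._coerce(other)
        p = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(p), max(p))

    __rmul__ = __mul__

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"


def horner(coeffs, x):
    """Evaluate coeffs[0]*x**n + ... + coeffs[n] for any type with + and *."""
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c
    return result


# The same, unmodified routine runs on floats and on intervals.
point_value = horner([1, -3, 2], 2.0)             # x**2 - 3x + 2 at x = 2
enclosure = horner([1, -3, 2], Interval(1, 2))    # encloses the range on [1, 2]
```

The computed enclosure [-2, 1] contains the true range [-1/4, 0] of the polynomial on [1, 2]; it is wider because the variable occurs more than once in the expression.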

2 Survey of Various Implementations

2.1 Packages for Scientific Computing Software Environments

IntLab for MatLab

IntLab [43,44] is an interval arithmetic package for MatLab. The main objective
of its author, S. Rump, is to compute verified results with capabilities similar
to MatLab's in terms of ease of use and of execution time. To this end, a clever
way to perform interval matrix operations has been developed, which takes
advantage of MatLab's highly optimized routines. Procedures have been developed
for automatic differentiation and for reliable solving of linear and nonlinear
systems of equations. Since standard functions are not reliable in MatLab,
S. Rump has also implemented guaranteed standard functions; a critical point is
reliable and accurate argument reduction, and to implement it, so-called “long”
arithmetic has been developed. Up to version 4.1.1, the procedures which have
been developed are mainly the ones required for argument reduction: arithmetic
operations, the π constant and the exponential function. This long arithmetic is
“rudimentary, slow but correct” according to its author. Few standard functions
are available and matrices with long components are not yet possible.



Package for Mathematica 

Interval is a datatype in Mathematica. J. Keiper [24] justifies its introduction 
with arguments similar to the ones given in Section 1.3: education of a large num- 
ber of potential users to interval arithmetic, ease of use, graphical possibilities 
and some examples to demonstrate the power of this arithmetic. 

Since Mathematica offers high precision floating-point arithmetic, it was quite
natural that intervals can have exact numbers or floating-point numbers with
arbitrary precision as endpoints. However, J. Keiper warns against two
unpleasant phenomena with Mathematica intervals. The first one is that outward
rounding is done by the software, since setting rounding modes at a low level
is non-portable; this implies some excess in the width of computed intervals.
For instance, with Mathematica version 4.2, the interval Interval[1.] has a
width of $4.44089 \times 10^{-16}$, even though 1.0 is exactly
representable, i.e. the width should be 0.
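
The reported width can be checked against IEEE double precision arithmetic. The sketch below assumes that each endpoint is moved outward by one machine epsilon (2^-52); this mechanism is an assumption made for illustration, only the total width is taken from the text.

```python
import sys

eps = sys.float_info.epsilon      # 2**-52, spacing of doubles just above 1.0

# Assumed mechanism: software outward rounding by eps on both sides of 1.0.
lo, hi = 1.0 - eps, 1.0 + eps
width = hi - lo                   # exactly 2**-51

print(f"{width:.5e}")             # 4.44089e-16
```

Both endpoints are exactly representable doubles, so the width 2^-51 is computed without rounding error and agrees with the value quoted for Interval[1.].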

The second unpleasant phenomenon is illustrated by the following sequence 
(in Mathematica version 4.2): 

In[1]:= e = 15 - 39 Sin[EulerGamma] - 2 Pi;

In[2]:= N[Interval[{e, e}], 16]

Out[2]= Interval[{-12.5652, -12.5652}]

In[3]:= N[Interval[{e, e}], 17]

Out[3]= Interval[{-12.565205412135305, -12.565205412135305}]

i.e. the intersection of the two resulting intervals, each of which should contain
the exact value, is empty. One possible explanation can be found in [24]: “Also, an
assumption is made that is known to be false: library functions for the elementary
functions are assumed to be correct to within one ulp and directed rounding by
one ulp is used to ‘ensure’ that the resulting interval contains the image of the
argument. There are no known examples for which the elementary functions are
in error by more than an ulp for high-precision arithmetic.” The erroneous result
above can also be attributed to unvalidated conversion from real to interval
and to unvalidated binary-to-decimal conversion in input/output routines.

In Mathematica, LU-related procedures and nonlinear system solvers can 
have intervals as arguments and return guaranteed results. Some extensions or 
applications based on this package are to be found in [7] and [33]. 



intpakX for Maple 

intpakX is a Maple package for interval arithmetic. It contains data types, basic 
arithmetic and standard functions for real interval arithmetic and complex 
disc arithmetic. Moreover, it implements a handful of algorithms for validated 
numerical computing and graphical output functions for the visualization 
of results. The package intpakX thus gives the user the opportunity to do 
validated computing with a Computer Algebra System. 

One motivation for the implementation of intpakX was to offer some algo- 
rithms and extended operations using the existing intpak framework [11] which 
used to be part of the now discontinued Maple Share Library. At the same 
time, the visualization of these interval applications should be possible, also as 
a means to easily confirm the computed data. Examples of this can be found in 
[15]; here, we simply give three examples of the enhanced or more convenient 
graphical output possibilities (see illustration). 




Fig. 1. Example output for the range enclosure of $f := x \to \exp(-x^2) \cdot \sin(\pi x^3)$ (left),
$g := (x, y) \to \exp(-xy) \cdot \sin(\pi x^2 y^2)$ (center), and a complex polynomial with three
different enclosures (right).

The other specific motivation was the fact that intervals can be defined in
Maple without using intpakX, but that the evaluation of interval expressions
does not behave according to all expected mathematical properties. Proper
rounding is not provided (see below) and there are a number of other effects
(like the simplification of terms prior to their evaluation, e.g. simplification of
[1,2] - [1,2] into 0). Facing this, there was a need for an interval arithmetic
which would offer the expected mathematical properties and correct operators.
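
Why the simplification of [1,2] - [1,2] into 0 is wrong for intervals can be seen from the definition of interval subtraction, in which the two operands vary independently. The following sketch uses plain tuples and a hypothetical helper `isub`; it does not reproduce Maple or intpakX syntax.

```python
def isub(x, y):
    """Interval subtraction: [a, b] - [c, d] = [a - d, b - c]."""
    (a, b), (c, d) = x, y
    return (a - d, b - c)


X = (1, 2)
diff = isub(X, X)   # (-1, 1): the set of all x - y with x, y in [1, 2]
```

The result [-1, 1] encloses every difference of two values taken from [1, 2], whereas the symbolic simplification to 0 silently assumes that both occurrences denote the same number.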



History and Implementation. The first intpak version was created in 1993
by R. Corless and A. Connell [11] as an effort to incorporate real intervals into
Maple. In 1999, intpakX was released by I. Geulig and W. Krämer [15,16] as
an extension to intpak incorporating important changes as well as a range of
applications and an additional part for complex numbers. The current release
intpakX v1.0 (June 2002) is a redesigned package combining the formerly
separate packages in one new version. In December 2002, it was released by Waterloo
Maple as Maple Power Tool Interval Arithmetic [1]. The package is implemented
as a Maple module (a feature Maple has offered since version 6).

The most important feature of the package is the introduction of new data 
types into Maple for 

— real intervals and 

— complex disc intervals. 

A range of operators and applications for these data types (see below) have 
been implemented separately (with names differing from the standard operators’ 
names), so that the new interval types do not rely on the (rough) notion of an 
interval Maple already has. So, intpakX intervals can be used safely with the 
implemented operators. 

Also, rounding is done separately, since there are examples where the rounding
included in Maple is not done correctly. Namely, the expression $x - \varepsilon$ ($x > 0$
a Maple floating-point number with $n$ decimal digits, $\varepsilon < 10^{-n}$) yields $x$ when
Rounding is set to 0 or $-\infty$, although it should yield the largest $n$-digit number
smaller than $x$. As needed in interval arithmetic, rounding is done outwardly in
computations with intpakX.
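
The expected behaviour of directed rounding in this situation can be reproduced with Python's `decimal` module, which implements radix-10 arithmetic with selectable rounding modes. This is only an analogy for illustration; it is not Maple code and says nothing about Maple's internals. The values of `n`, `x` and `e` are arbitrary choices.

```python
from decimal import Decimal, Context, ROUND_FLOOR, ROUND_HALF_EVEN

n = 4
x = Decimal("1.000")   # a number with n significant decimal digits
e = Decimal("1E-10")   # 0 < e < 10**(-n)

# Rounding towards minus infinity: the largest n-digit number below x.
down = Context(prec=n, rounding=ROUND_FLOOR).subtract(x, e)

# Rounding to nearest: the tiny subtrahend is absorbed and x is returned.
nearest = Context(prec=n, rounding=ROUND_HALF_EVEN).subtract(x, e)
```

With rounding towards minus infinity the result is 0.9999, a valid lower bound for x - e; returning x itself, as in the behaviour criticized above, would not be a lower bound.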

intpakX functions, though being separately implemented, use standard 
Maple operators and functions (intpakX interval sin uses the Maple sin im- 
plementation for example). Thus, errors in Maple arithmetic being greater than 
1 ulp will affect intpakX results. 

The graphical functions included in intpakX make it easier to use Maple 
graphics in conjunction with interval computations. They use Maple graphics 
features to offer special output for the visualization of the intervals resulting 
from the concerned intpakX functions. 



Scope of implemented functions and applications. As mentioned above, 
intpakX defines Maple types for real intervals and complex disc intervals. 
Here is a survey of the operators, functions and algorithms that intpakX
includes. First, functions and operators for real intervals are given followed by 
the incorporated numerical algorithms. After that, the functions for complex 
intervals are specified. 

— On the level of basic operations, intpakX includes the four basic arithmetic
operators denoted as &+, &-, &*, &/. It also includes extended interval
division as an extra function.

— Furthermore power, square, square root, logarithm and exponential functions 
(note that square is implemented separately from general multiplication as 
needed for intervals) as well as union and intersection are provided. 

— A set of standard functions has been implemented (sin, cos, tan as well as 
their inverse and hyperbolic versions). 

— Reimplementations of the Maple construction, conversion and unapplication 
functions are added. 

The following numerical algorithms are implemented to work with the fore- 
going functions (for short examples, see [17]): 

— verified computation of zeros (Interval Newton Method) with the possibility 
to find enclosures of all zeros of a function on a specified (adequately small) 
interval; a branch and bound technique is used to display the resulting in- 
tervals in each step. 

— range enclosure for real-valued functions of one or two variables, which uses
either interval evaluation or evaluation via the mean value form and adaptive
subdivision of intervals.

Using the above algorithms, the user can choose between a non-graphical and a 
graphical version displaying the resulting intervals of each iteration step. 
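
The effect of subdivision on a range enclosure can be sketched as follows. The example is a simplification made for illustration: it uses uniform instead of adaptive subdivision, naive interval evaluation instead of the mean value form, and exact rational endpoints so that rounding plays no role. All function names are hypothetical and unrelated to the intpakX interface.

```python
from fractions import Fraction


def imul(x, y):
    """Interval product of x = (x_lo, x_hi) and y = (y_lo, y_hi)."""
    p = [x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1]]
    return (min(p), max(p))


def isub(x, y):
    """Interval difference x - y."""
    return (x[0] - y[1], x[1] - y[0])


def f_interval(x):
    """Naive interval evaluation of f(x) = x * (1 - x)."""
    one = (Fraction(1), Fraction(1))
    return imul(x, isub(one, x))


def enclose(lo, hi, pieces):
    """Hull of the naive enclosures over a uniform subdivision of [lo, hi]."""
    lo, hi = Fraction(lo), Fraction(hi)
    h = (hi - lo) / pieces
    parts = [f_interval((lo + k * h, lo + (k + 1) * h)) for k in range(pieces)]
    return (min(p[0] for p in parts), max(p[1] for p in parts))


coarse = enclose(0, 1, 1)   # (0, 1)
fine = enclose(0, 1, 8)     # (0, 5/16)
```

The true range of f on [0, 1] is [0, 1/4]. A single evaluation overestimates it by a factor of four, while eight subintervals already reduce the upper bound to 5/16.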

Like for real intervals, there is a range of operators for complex disc arith- 
metic: 

— in addition to the basic arithmetic operators, there are area-optimal multi- 
plication and division as an alternative to carry out these operations; 

— as a further function, the complex exponential function has been
implemented. Let us denote by $Z := (c, r)$ the complex disc centered at $c$ with
radius $r$, with $c$ a complex number and $r$ a nonnegative real number. Interval
operations are used to compute the complex disc

\[
\exp((c, r)) := \Bigl( \exp(c),\ \max_{\varphi \in [0, 2\pi)} \bigl| \exp\bigl(c + r(\cos\varphi + i \sin\varphi)\bigr) - \exp(c) \bigr| \Bigr)
= \bigl( e^c,\ |e^c| \, (e^r - 1) \bigr) \tag{1}
\]

with $e^c = e^{c_1} (\cos c_2 + i \sin c_2)$ (for $c = c_1 + i c_2$); this is discussed in more
detail in [15]. The upper bound of the resulting interval for the radius
is used as the radius of the new disc, while the new center is defined by the
midpoint of $e^c$ (interpreted as a rectangular complex interval). Formula (1)
uses the fact that the maximum value of $|\exp(z) - \exp(c)|$, $z \in Z$, is reached
for $z \in \partial Z$ (see, e.g., [14]).
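
Formula (1) can be checked numerically by sampling the boundary of the input disc. The sketch below uses ordinary floating-point arithmetic without outward rounding, so it is a plausibility check and not a verified computation; `disc_exp` is a hypothetical name, and the sample disc is an arbitrary choice.

```python
import cmath
import math


def disc_exp(c, r):
    """Return center and radius of the disc (e^c, |e^c| (e^r - 1))."""
    center = cmath.exp(c)
    radius = abs(center) * math.expm1(r)   # expm1(r) = e^r - 1
    return center, radius


c, r = complex(0.5, 1.0), 0.25
center, radius = disc_exp(c, r)

# Largest sampled distance |exp(z) - exp(c)| for z on the circle |z - c| = r.
worst = max(
    abs(cmath.exp(c + r * cmath.exp(1j * 2 * math.pi * k / 360)) - center)
    for k in range(360)
)
```

The maximum is attained for the boundary point z = c + r, which corresponds to the sample k = 0, so the sampled maximum agrees with the closed form up to rounding.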

Range enclosure for complex polynomials serves as an application for complex
interval arithmetic. Three different versions are implemented: the first and
second use a Horner scheme with centered and area-optimal multiplication,
respectively; the third one uses a centered form.



2.2 Languages and Libraries 

Few languages and compilers include support for interval arithmetic; we
mention the XSC languages [3] (C/C++ [25], Pascal [26]) and the Sun Forte
compilers for Fortran and C/C++ [47]. However, times are changing and for instance
the introduction of interval arithmetic in the BLAS library is being discussed
(cf. http://www.netlib.org/blas/blast-forum/).



XSC (extended Scientific Computing) Languages 

Multiple precision interval arithmetic is even rarer. Besides interval
arithmetic, the XSC languages offer a “staggered” arithmetic, which is a multiple,
but fixed, precision arithmetic. The chosen precision enables the exact
computation of the dot product of two vectors of reasonable size with “double”
floating-point components. This multiple precision type can be used for
floating-point and interval values; it is called “dotprecision”, and the
corresponding arithmetic “staggered”. This type of multiple precision number
consists of a vector $(x_1, \ldots, x_n)$ of double precision numbers whose sum yields
the represented number $x = \sum_i x_i$. Such vectors can contain up to 39 entries.
Indeed, the precision is limited to what the dot product of double precision
vectors requires: their range of exponents, $\{-1022, \ldots, 1023\}$, plus
extra positions to take into account the vectors’ length.
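
The idea of representing a number as an unevaluated sum of doubles can be sketched with an error-free transformation. The code below illustrates the representation only and is not the XSC implementation; `two_sum` is Knuth's classical TwoSum algorithm, and `exact_value` is a hypothetical helper that evaluates the sum in exact rational arithmetic.

```python
from fractions import Fraction


def two_sum(a, b):
    """Error-free transformation: a + b equals s + err exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err


def exact_value(components):
    """Exact rational value of an unevaluated sum of doubles."""
    return sum((Fraction(x) for x in components), Fraction(0))


a, b = 1.0, 2.0 ** -60
s, err = two_sum(a, b)     # s is the rounded sum, err the rounding error
staggered = [s, err]       # two doubles that together represent a + b
```

The sum 1 + 2^-60 does not fit into a single double (the rounded sum `s` is 1.0), but the pair `[s, err]` represents it exactly.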

The details of this type of multiple precision arithmetic and its
implementation can be found in [25] or [29]. Apart from computing accurate dot
products, it has also been used for Horner evaluation of a polynomial in the
interval Newton algorithm [28].



The Range Arithmetic 

Other works are libraries rather than languages or compilers; they are
developed in a given programming language. For instance, the “range” library
was developed by Aberth et al. as early as 1992 [4]: C++ was chosen
for its operator overloading facility, and the library is thus easy to use; indeed,
formulas involving “range” operands can be written exactly as formulas with
usual floating-point operands. It has to be mentioned that the C++ language
has evolved and the “range” library is now difficult to compile because its C++
dialect is too old for most compilers. The “range” type is an arbitrary precision
floating-point type coupled with a “range”, which controls the accuracy of the
represented number: only relevant digits are stored, i.e. digits that are more
significant than the range, which can be seen as an absolute error. For instance,
when a cancellation occurs, the result has a small number of digits.
Aberth has developed numerical algorithms using this automatic accuracy
control and presented them in [5]. This range arithmetic can be seen as a form 
of interval arithmetic, as long as no large intervals are used, since they cannot 
be represented as range objects: the range has to be smaller (in absolute value) 
than the corresponding number. 



Brent’s MP, Augment, and a Multiple Precision Interval Package by 
Yohe 

The oldest library implementing multiple precision interval arithmetic may well 
be the one developed in Fortran by Yohe in 1980 [49]. It is based on the one hand 
on the Augment preprocessor, which replaced arithmetic operators by calls to 
the appropriate functions, as operator overloading was not available, and on the 
other hand on Brent’s MP package for multiple precision floating-point
arithmetic [10]. However, Brent himself recommends using a more recent package
than MP: “MP is now obsolescent. Very few changes to the code or
documentation have been made since 1981! [...] In general, we recommend the use
of a more modern package, for example David Bayley’s MPP package or MPFR” (cf.
http://web.comlab.ox.ac.uk/oucl/work/richard.brent/pub/pub043.html).



Other Works 

The two packages which will be introduced now are based either on MPFR,
following Brent’s recommendation (the MPFI package), or on the floating-point
type of the GMP package [2] (the GMP-XSC package). MPFI is presented first
because it contains more “basic” functionalities, whereas GMP-XSC provides
more elaborate features such as special functions.

2.3 MPFI 

In order to implement an arbitrary precision interval arithmetic, a multiple
precision floating-point library was needed. MPFR (Multiple Precision
Floating-point Reliable arithmetic library) was chosen because it is a library for
arbitrary precision floating-point arithmetic that complies with the IEEE-754
standard [20] and goes beyond it: it provides exact outward rounding for the
arithmetic and algebraic operations, for conversions between different data types
and also for the standard functions. Furthermore, it is portable and efficient:
MPFR is based on GMP and efficiency is a motto for its developers. Its source
code is available. MPFR is developed by the Spaces team, INRIA, France [13].

The MPFI library implements interval arithmetic on top of MPFR. MPFI
stands for Multiple Precision Floating-point Interval arithmetic library; it is a
portable library written in C, and its source code and documentation can be
freely downloaded [39].

Intervals are implemented using their endpoints, which are MPFR floating- 
point numbers. The specifications used for the implementation are based on the 
IEEE-754 standard: 







— an interval is a connected closed subset of $\mathbb{R}$;

— if op is an $n$-ary operation and $\mathbf{x}_1, \ldots, \mathbf{x}_n$ are intervals, the result of
$\mathrm{op}(\mathbf{x}_1, \ldots, \mathbf{x}_n)$, the operation op performed with interval arguments, is an
interval such that $\{\mathrm{op}(x_1, \ldots, x_n) : x_i \in \mathbf{x}_i\} \subseteq \mathrm{op}(\mathbf{x}_1, \ldots, \mathbf{x}_n)$;

— furthermore, $\mathrm{op}(\mathbf{x}_1, \ldots, \mathbf{x}_n)$ or $f(\mathbf{x}_1, \ldots, \mathbf{x}_n)$, where $f$ is an elementary
function, returns the tightest enclosing interval with floating-point endpoints;

— in case $\mathrm{op}(x_1, \ldots, x_n)$ is not defined, then a NaN (“Not a Number”, which
stands for an invalid operation) is generated, i.e. the intersection with the
domain of op is not taken prior to the operation;

— each endpoint carries its own precision (set at initialization or modified
during the computations).
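
The inclusion property and the requirement of the tightest enclosure with floating-point endpoints can be illustrated for interval addition in double precision. This sketch does not use MPFI or MPFR; it emulates exact directed rounding by computing the endpoint sums exactly with rational numbers and then rounding them outward. The names `round_down`, `round_up` and `iadd` are hypothetical, and `math.nextafter` requires Python 3.9 or later.

```python
import math
from fractions import Fraction


def round_down(q):
    """Largest double that is not greater than the rational number q."""
    f = float(q)   # correctly rounded to nearest
    return f if Fraction(f) <= q else math.nextafter(f, -math.inf)


def round_up(q):
    """Smallest double that is not less than the rational number q."""
    f = float(q)
    return f if Fraction(f) >= q else math.nextafter(f, math.inf)


def iadd(x, y):
    """Tightest double precision enclosure of x + y for intervals x, y."""
    lo = Fraction(x[0]) + Fraction(y[0])   # exact sums of the endpoints
    hi = Fraction(x[1]) + Fraction(y[1])
    return (round_down(lo), round_up(hi))


z = iadd((0.1, 0.1), (0.2, 0.2))
exact = Fraction(0.1) + Fraction(0.2)      # exact sum of the two doubles
```

The exact sum of the doubles 0.1 and 0.2 is not itself a double, so the result is the pair of adjacent doubles surrounding it; no narrower enclosure with double precision endpoints exists.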

The arithmetic operations are implemented and all functions provided by 
MPFR are included as well (trigonometric and hyperbolic trigonometric func- 
tions and their inverses). Conversions to and from usual and GMP data types 
are available as well as rudimentary input/output functions. The code is written
according to GMP standards (function and argument names, memory
management).

The largest achievable computing precision is determined by MPFR and 
depends in practice on the computer memory. The only theoretical limitation 
(which will be removed in future versions) is that the exponent must fit in a 
machine integer. It suffices to say that it is possible to compute with numbers of 
several millions of binary digits if needed. The computing precision is dynami- 
cally adjustable in response to the accuracy needed. 

2.4 GMP-XSC 

GMP-XSC was intended as a fast multiple precision package that might supple- 
ment the well-known package C-XSC. The name indicates that it is also based 
on the GNU multiple precision subroutines. The need for GMP-XSC came from 
Application 4.5 described below. The problem was to evaluate an integral over 
the real half axis. The integrand is oscillatory and thus, the cancellations are 
huge. This calls for a high precision arithmetic. Furthermore, the integrand con- 
tains special functions. One of them as well as elementary functions had to be 
evaluated in the complex plane. Finally, huge high-order derivatives had to be
estimated on intervals by using interval arithmetic. For these estimates, multiple
precision is not necessary, but we need an arithmetic that deals with large exponents.

GMP-XSC contains all features that are necessary to solve the problem that
was just described briefly and that will be described in more detail below. It has
some extra functions, and work on completing it continues. GMP-XSC is essentially
a C++ wrapper for the C program GMP-SC, which does the main work.
It contains GMP-like routines including arithmetic operations, many elementary
functions and some special functions for floating-point numbers (mpf_t, the
original GMP data type), complex numbers (mpc_t), intervals (mpi_t), rectangular
complex intervals (mpci_t), “large doubles” (large_d, which is a structure
consisting of a double and an integer meaning the exponent) and “large intervals”
(large_i, which is an interval between two large_d-s).
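
The idea behind a “large double”, a double precision mantissa paired with a separate integer exponent, can be sketched as follows. The type `LargeD`, its normalization convention and the functions `normalize` and `mul` are assumptions made for illustration; they do not reproduce the layout or the routines of GMP-SC, and directed rounding is omitted.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LargeD:
    mant: float   # mantissa with 0.5 <= |mant| < 1, or 0.0
    exp: int      # unbounded binary exponent; the value is mant * 2**exp


def normalize(mant, exp):
    """Bring mant * 2**exp into the normalized form described above."""
    m, e = math.frexp(mant)   # mant == m * 2**e with 0.5 <= |m| < 1
    return LargeD(m, exp + e) if m != 0.0 else LargeD(0.0, 0)


def mul(a, b):
    """Product of two large doubles; the exponents are added as integers."""
    return normalize(a.mant * b.mant, a.exp + b.exp)


big = normalize(1.0, 5000)    # 2**5000, far outside the range of a double
sq = mul(big, big)            # 2**10000, still representable
```

Because the exponent is an ordinary integer, products of very large quantities, such as the high order derivatives mentioned above, cannot overflow, while the precision remains that of a double.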

Those special functions that were needed for the above-mentioned project are
implemented. These are the Gamma function, the complementary error function 
and Hermite functions (see [6] or [32]). 

2.5 Final Remark 

MPFI and GMP-XSC have been developed at the same time. The authors did 
not know about the projects of each other. It is intended to produce one library 
that contains the advantages of both products. 

3 Comparison and Results 

From now on, the focus will be on three packages, one for Maple: intpakX, and 
two C/C++ libraries: MPFI and GMP-XSC. These packages are recent and they 
offer arbitrary precision and the usual set of standard functions. 

They are compared using the following criteria: ease of use, accuracy and 
timing. Before presenting details, let us recall some intpakX features. 

3.1 intpakX Specifics 

The need for symbolic computing is a main reason for using a Maple package;
for purely numerical computations, one would not necessarily choose it.
Furthermore, a Computer Algebra System (abbreviated as CAS in the
following) has to be easy to use to serve its purpose in teaching and as a means of
confirmation and visualization alongside other computing environments.

Convenience is difficult to measure, but a greater ease of use often comes at 
the expense of less efficiency, so the expectation is that a CAS package might be 
efficient for the CAS in question, but usually slower than a programming library. 
Also, results obtained using the package in a graphical user interface (or GUI) 
will look different from those you get using a command line version of the CAS. 

This has to be considered when you compare the times of the three packages 
mentioned before. Yet, the architecture of the multiple precision arithmetic and 
data type still plays an important role. 

3.2 Accuracy 

In a multiple precision environment, one wants especially tight enclosures
of all results. In Maple, the precision can be set via the environment
variable Digits. This variable is used in intpakX functions to calculate the
necessary number of decimal digits for any calculation. In C/C++ libraries,
variable and arbitrary computing precision is also possible: this is achieved through
dynamic memory allocation to store the numbers.

The tightness of the results is governed by the way outward rounding is
performed. With MPFR and thus MPFI, exact directed rounding is done, i.e.
the resulting intervals are the tightest guaranteed enclosures of the exact results.
In intpakX, the resulting intervals are rounded outwardly by 1 ulp, yielding an
interval with a width of 2 ulps in a single calculation. In any case, the accuracy
of the result thus only depends on the precision used and on the number of
calculations done. In the implemented interval methods, the precision is adjusted
to yield a result with the desired accuracy, and the user can specify the relative
diameter of the intervals to be computed (or the number of iteration steps to be
done). Thus, how tight the resulting intervals are depends on these settings.
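
The “round outward by 1 ulp” strategy can be mimicked with binary doubles, as in the following sketch. It is an analogy only: intpakX works with decimal Maple floats, and the function name add_outward is hypothetical. The sketch assumes Python 3.9 or later (for math.nextafter) and the default round-to-nearest mode, in which a computed sum is at most half an ulp away from the exact sum, so stepping one ulp outward is enough to enclose it.

```python
import math
from fractions import Fraction

def add_outward(a, b):
    """Enclose [a_lo + b_lo, a_hi + b_hi], widening each bound by one ulp."""
    lo = math.nextafter(a[0] + b[0], -math.inf)
    hi = math.nextafter(a[1] + b[1], math.inf)
    return lo, hi

x = (0.1, 0.1)          # point intervals holding the doubles nearest 0.1 and 0.2
y = (0.2, 0.2)
s = add_outward(x, y)
exact = Fraction(0.1) + Fraction(0.2)   # exact sum of the two doubles
print(s, Fraction(s[0]) <= exact <= Fraction(s[1]))
```

With exact directed rounding, as in MPFR, one of the two bounds would coincide with the rounded-to-nearest result, so the enclosure would be one ulp wide instead of two.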



3.3 Timing 

While the quality of results is a feature inherent in high precision arithmetic,
the question of memory and speed determines to what degree a package can be
used in practice. The times presented in the tests subsection show how problem
sizes and numbers of digits can be chosen to get results in reasonable time.

There is a maximum number of decimal digits predefined in the Maple kernel 
options which is set to 268435448. This is only a theoretical limit to the compu- 
tations done since the tests were done with smaller numbers of digits. The limits 
with MPFI and GMP-XSC are that the exponents must fit into a machine in- 
teger (this limitation should be soon removed from GMP/MPFR) and that the 
mantissa cannot exceed the available memory. 

The following tests were executed with different packages to compare the 
speed of 

— standard Maple arithmetic and interval arithmetic using intpakX; 

— intpakX as a CAS package and programming languages/libraries; 

- MPFR and MPFI; 

- C-XSC and GMP-XSC. 



Test Arrangements 

— In Maple, intpakX results have been compared to non-interval Maple results, 
both with different numbers of decimal digits. 

— The same calculations have been done in C-XSC using real floating-point 
numbers, real intervals and multiple precision intervals (staggered arith- 
metic) with different lengths. 

— They have also been performed using GMP-XSC. 

— Finally, the same set of tests has been done using MPFR and MPFI. 

Two particular tests have been executed: 

1. to test the speed of basic operators, matrix multiplications of different sizes 
and with varying computing precision have been done in the environments 
mentioned; 

2. standard functions have been tested in expressions with single or multiple
occurrences of different standard functions.

Furthermore, the section on applications contains tests on the applications
included in the intpakX package and various applications either solved by
GMP-XSC or MPFI or which were the starting motivation for their development.

More details on the performed tests are presented together with the corre- 
sponding results. 

The results have been measured on a Sun Ultra 10 440MHz computer, except
the MPFI experiments which have been conducted on a Sun Ultra 5 330MHz,
and for which a correcting multiplying factor of 330/440 has been applied. The
software versions used for the computations are Maple 8 with intpakX v.1.0,
C-XSC 2.0 beta2 with GNU g++-3.2, GMP 3.2 with gcc-3.2, and MPFI 1.1,
based on GMP-4.1.2, with gcc-3.0.3 -O2 or g++-3.0.3 -O2. All times are
displayed in seconds.



Results 

Matrix Multiplications (Maple) 

The following times have resulted from a multiplication of matrices “by hand”
(i.e. using 3 nested loops, since the absence of overloaded operators in intpakX
does not allow a direct multiplication of matrices). Different (full) matrices have
been tested, including the Hilbert matrix. This implies that the times below are
not strictly valid for all examples, but show the ratio between non-interval and
intpakX interval computations.

The numbers of digits given (15, 30, 90) are related to the corresponding 
lengths for C-XSC real intervals and staggered intervals with 2 or 6 reals (a real 
variable has about 15 decimal digits accuracy) . 



Data Type / Matrix Size    15 Digits    90 Digits    540 Digits

Maple float
  10x10                         0.08         0.21          0.78
  20x20                         0.86         1.85          6.86
  30x30                         2.59         5.75         25.94

intpakX interval
  10x10                         2.65         2.78          6.72
  20x20                        20.16        23.38         63.59
  30x30                        72.46        81.84        237.28



The ratio between interval computations and their floating-point counter- 
parts is given in the following table: 



Matrix Size    15 Digits    90 Digits    540 Digits
10x10                 33           13           8.6
20x20                 23           13           9.3
30x30                 28           14           9.1
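
The ratios in this table follow directly from the measured times. As a quick check, the short sketch below recomputes them; the dictionary names are illustrative only, and the three columns are kept in the order of the timing table.

```python
# Times in seconds, one tuple per matrix size, columns as in the timing table.
maple_float = {"10x10": (0.08, 0.21, 0.78),
               "20x20": (0.86, 1.85, 6.86),
               "30x30": (2.59, 5.75, 25.94)}
intpakx = {"10x10": (2.65, 2.78, 6.72),
           "20x20": (20.16, 23.38, 63.59),
           "30x30": (72.46, 81.84, 237.28)}

# Ratio of interval time to floating-point time for every size and column.
ratios = {size: tuple(i / m for i, m in zip(intpakx[size], maple_float[size]))
          for size in maple_float}

for size, row in ratios.items():
    print(size, [round(r, 1) for r in row])
```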



It can be seen that the ratios for the different numbers of digits stay in the
same range for growing matrix sizes while decreasing with growing numbers of
digits.

Matrix Multiplications (C-XSC)

Size       imatrix    l_imatrix (2 reals)    l_imatrix (6 reals)
20x20         0.07                   0.15                   0.68
100x100       7.92                  16.18                  83.19
200x200      63.70                 132.38                 663.07



Matrix Multiplications (GMP-XSC) 



Size       15 Digits    30 Digits    90 Digits    540 Digits
20x20           0.07         0.09         0.09          0.09
100x100         8.19         9.09         9.41         12.83
200x200        79.10        81.60        86.20        121.28



Matrix Multiplication (MPFI) 

Times using MPFR are not reported here. Previous experiments [40] report an 
overhead factor between 2 and 4 for matrix operations. 



Size       15 Digits    30 Digits    90 Digits    540 Digits
20x20           0.01         0.01         0.02          0.03
100x100         1.89         2.16         3.88          5.71
200x200        15.78        18.59        23.99         47.97



GMP-XSC is slightly slower than MPFI because the focus was more on spe- 
cial functions with real or complex argument than on sophisticated rounding 
routines (see the remark in Section 2.5). 

If you consider the standard number of 15 digits, times using C-XSC or GMP- 
XSC are about ten times faster than with intpakX, and times using MPFI are 
more than 50 times faster than with intpakX. With growing numbers of digits, 
the increase of times is greater in C-XSC than in Maple or especially in GMP- 
XSC. 

This effect becomes even more visible when testing the standard functions.

Standard Functions (Maple)

The standard functions were evaluated executing 1000 iterations with chang- 
ing values for x. The computation time for the parameters is included in the 
numbers, but did not account for a major part of the times measured. 

As an example, we give the Maple code for the performed operation (including 
the loading of the package): 

restart;
libname := "/home/wmwr3/grimmer/maple/intpak/new/v1.0/lib", libname;
with(intpakX):

Digits := 90;
wid := 0.001;
imax := 1000;

expr1 := sin(x);
f := inapply(expr1, x);   # convert to interval expression

sti := time();
for i from 1 to imax do
  param := i*0.01:
  param2 := param + wid:
  result[i] := f([param, param2]):
od:

fti := time();
dti := fti - sti;





Function    Maple float (90 Digits)    intpakX int. (90 Digits)    ratio
sin(x)                         4.63                       19.42      4.1
sinh(x)                        2.74                        4.71      1.7
exp(x)                         2.60                        4.20      1.6



Standard Functions (C-XSC) 





Function    interval    l_interval (2 reals)    l_interval (6 reals)
sin(x)        0.0014                   17.61                   57.20
sinh(x)       0.0015                   25.95                   92.53
exp(x)        0.0012                   17.74                   78.78



Standard Functions (Single Occurrence, GMP-XSC) 





Function            15 Digits    30 Digits    90 Digits
sin(x)                   0.22         0.30         0.74
sinh(x)/cosh(x)          0.25         0.35         0.68
exp(x)                   0.16         0.23         0.52



The tables show that on the one hand, C-XSC times using staggered arith- 
metic are much higher even than Maple times and at the same time fast growing 
with increasing numbers of reals in one staggered variable. This shows that the 
C-XSC staggered arithmetic is not efficient being implemented as software only. 

On the other hand, you can also see that standard IEEE arithmetic (as 
used in C-XSC real numbers) is still much faster than GMP multiple precision 
arithmetic with the same number of digits. 

Computing expressions with multiple occurrences of standard functions
yields similar results (roughly speaking, times add up if you do more than one
evaluation of a standard function; times thus strongly depend on the expressions
themselves).

In addition to the results above, here are some more results for a
single evaluation of the standard functions with greater numbers of digits in
intpakX and GMP-XSC.

Standard Functions (Maple) 





Function    10000 Digits    20000 Digits    40000 Digits    100000 Digits
sin(x)             14.62           57.25          196.95           1586.5
sinh(x)             2.92           10.79           41.04           234.03
exp(x)              3.28           12.21           46.59           249.05



Standard Functions (GMP-XSC) 





Function    10000 Digits    20000 Digits    40000 Digits    100000 Digits
sin                 2.50            9.80           39.44           225.47
sinh                1.18            4.83           18.51           104.12
exp                 1.15            4.63           17.81           103.38



Since MPFR is slower than GMP, times are not reported here: it suffices to 
say they are longer. Indeed, the results returned by MPFR are exactly rounded 
results and this can explain the relatively high computing times. MPFI also 
returns the tightest enclosures of the exact results. It has been observed that 
MPFI times are much higher than MPFR times: a possible explanation for the 
trigonometric functions is that argument reduction is performed twice, once by 
MPFR and once by MPFI. But since this phenomenon is also observed for the 
other functions, it is a hint that programming improvements have to be done in 
MPFI. 

One would expect a programming library to be faster, so it is striking that the
ratios comparing Maple, MPFR and GMP-XSC times are relatively small. The MPFI times
are even higher.

Further Remarks 

— Considering the comparison of Maple and intpakX times, we found decreas- 
ing ratios for greater numbers of digits. This can be credited to the fact that 
the additional time for interval computations comprises time for arithmetic 
operations and some overhead time. The influence of the latter decreases 
when more time is used by arithmetic operations. 

— For large numbers of digits, the computation time using the GUI version of 
Maple was significantly higher (up to twice) than using the command line 
version. 

— For periodical functions (sin, cos, etc.) intpakX times are about 5-7 times 
larger than Maple floating-point operations due to a shift of the interval 
bounds and numerous case distinctions. For monotonic functions such as the
exponential function, the factor is approximately 2. The tests included the
reading of the parameter and storage of the result, which resulted in factors
slightly smaller than 2.

Results of two of the implemented applications can be found in the following
section.



4 Applications 

In this section we give results of some applications for the interval packages. 



4.1 intpakX for Maple 

intpakX includes some applications of the defined interval types, functions and 
operators. In this subsection, we want to give some numbers to show to what 
extent and up to which level of accuracy the packages can be used conveniently. 

The tested applications are the Interval Newton Method and Range Enclo- 
sure for functions of one real variable. A theoretical foundation has been given 
in [15]. 

The main criterion to be watched was the speed of the application executing 
the algorithms with growing numbers of iterations. 

Here are times for the Interval Newton Method, first testing the computation 
of an interval containing 6 zeros with growing number of digits, then testing the 
computation of a growing number of zeros with constant number of digits (100) 
for sin as an example. 

Interval Newton Method 



Digits            1000       2000       4000       10000
Time (secs.)     79.66     259.08     873.62     5072.57



Obviously the complexity of operations is quadratic with respect to the 
number of digits used here, whereas it is linear in the number of zeros: 



Zeros    Iteration steps       Time
   31                247      26.78
  318               2398     268.00
 3183              23243    2666.71
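
The growth rates claimed above can be checked roughly from the two tables by computing the slope between consecutive measurements on a log-log scale: a slope near 2 indicates quadratic growth, a slope near 1 linear growth. The sketch below does this; the helper names are illustrative.

```python
import math

def loglog_slope(x0, t0, x1, t1):
    """Exponent p for which t grows like x**p between two measurements."""
    return math.log(t1 / t0) / math.log(x1 / x0)

def slopes(rows):
    """Slopes between consecutive (size, time) measurements."""
    return [loglog_slope(*a, *b) for a, b in zip(rows, rows[1:])]

digits_times = [(1000, 79.66), (2000, 259.08), (4000, 873.62), (10000, 5072.57)]
zeros_times = [(31, 26.78), (318, 268.00), (3183, 2666.71)]

digit_slopes = slopes(digits_times)   # growth with the number of digits
zero_slopes = slopes(zeros_times)     # growth with the number of zeros
print([round(s, 2) for s in digit_slopes], [round(s, 2) for s in zero_slopes])
```

On these data the exponents with respect to the number of digits lie somewhat below 2 and increase with the precision, while those with respect to the number of zeros are very close to 1.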



Range Enclosure (2D) 

Finally, some times for the range enclosure of a function of one real variable
are given below, doing different numbers of subdivisions of the starting interval
(here: evaluation of $f(x) = \exp(-x^2) \sin(\pi x^3)$ over the interval $X := [0.5, 2]$).

Number of Subdiv.        5         10          15
Time                 27.89     437.14     6834.20

4.2 Extended Interval Newton Algorithm

The interval Newton algorithm [19] has been adapted to arbitrary precision
computations and implemented, cf. [38].

With an interval arithmetic based on hardware floating-point numbers, the
accuracy of the result is limited; in particular, with a root of multiplicity m > 1
or a cluster of m zeros, the accuracy on this zero is the computing precision
divided by m. However, the interval Newton algorithm is based either on a contracting
scheme or, if the contraction is not efficient enough, on a bisection. This implies
that arbitrary accuracy can be reached, provided enough computing precision is
available. This remark led us to adapt and implement the interval Newton algorithm
in MPFI.

The adapted interval Newton algorithm exhibits the following features: 

— arbitrary accuracy can be reached both on the enclosure of the zeros and on 
the range of the function on this enclosure, up to computer limits (time / 
memory) ; 

— the computing precision is automatically adapted when needed; this happens 
when bisection is no longer possible because the current interval contains only
two floating-point numbers, or when the enclosure of the function value no longer
narrows as the argument gets narrower.
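
The interplay of contraction and bisection can be illustrated by the following minimal sketch for $f(x) = x^2 - 2$ on $[-2, 2]$. It is not the MPFI implementation: exact rational arithmetic (Python's Fraction) stands in for adaptive multiple precision, so no outward rounding or precision increase is needed, and all function names are hypothetical.

```python
from fractions import Fraction

def f(x):
    return x * x - 2

def f_range(lo, hi):
    """Exact range of x^2 - 2 over [lo, hi]."""
    squares = (lo * lo, hi * hi)
    low = 0 if lo <= 0 <= hi else min(squares)
    return low - 2, max(squares) - 2

def df_range(lo, hi):
    """Exact range of the derivative 2x over [lo, hi]."""
    return 2 * lo, 2 * hi

def interval_newton(lo, hi, tol, found):
    flo, fhi = f_range(lo, hi)
    if flo > 0 or fhi < 0:
        return                          # f has no zero in [lo, hi]
    if hi - lo <= tol:
        found.append((lo, hi))          # narrow enough: report the enclosure
        return
    dlo, dhi = df_range(lo, hi)
    mid = (lo + hi) / 2
    if dlo <= 0 <= dhi:                 # derivative may vanish: bisect
        interval_newton(lo, mid, tol, found)
        interval_newton(mid, hi, tol, found)
        return
    # Newton operator N(X) = mid - f(mid) / F'(X), intersected with X.
    q = (f(mid) / dlo, f(mid) / dhi)
    new_lo = max(lo, mid - max(q))
    new_hi = min(hi, mid - min(q))
    if new_lo > new_hi:
        return                          # empty intersection: no zero
    if new_hi - new_lo > (hi - lo) / 2: # weak contraction: bisect as well
        half = (new_lo + new_hi) / 2
        interval_newton(new_lo, half, tol, found)
        interval_newton(half, new_hi, tol, found)
    else:
        interval_newton(new_lo, new_hi, tol, found)

roots = []
interval_newton(Fraction(-2), Fraction(2), Fraction(1, 10**20), roots)
for lo, hi in roots:
    print(float(lo), float(hi - lo))
```

Both zeros are enclosed to a width below the requested tolerance. In a floating-point setting, the point where this sketch simply keeps computing with longer rationals is where the computing precision would have to be increased.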

Some experiments have been conducted on polynomials [38]. The first series
concerns Chebyshev polynomials. They are known to be difficult to evaluate
accurately even if they take their values in $[-1, 1]$, because their coefficients are
large. A consequence is thus that it is quite difficult to get a small “residual”
$F(X)$, smaller than the stopping threshold $\varepsilon_Y$. For instance, MATLAB determines
only 6 roots of $C_{30}$, the Chebyshev polynomial of degree 30 (it finds 24 complex
roots for the 24 remaining ones), with 5 correct decimal digits. It finds only 8
roots of $C_{26}$, with 3 correct decimal digits. Yet the coefficients of $C_{26}$ and of $C_{30}$ are
exactly representable by machine numbers, so these results are not due to the
approximation of the coefficients by double precision floating-point numbers.
The proposed interval Newton algorithm gives very satisfactory results: every
root is determined, no superfluous interval is returned as potentially containing
a root and the existence and uniqueness of the roots in each enclosing interval
is proven, for most of them.

A second series presents quite the same conclusions obtained with the Wilkinson
polynomial of degree 20, $W_{20}(x) = \prod_{i=1}^{20} (x - i)$, written in the expanded form.
The initial precision is chosen large enough to enable the exact representation
of the coefficients. This polynomial is difficult to evaluate accurately because
its coefficients are large (their order of magnitude is 20!) and because it takes
large values between its roots (their order of magnitude is $10^{16}$). Consequently
it is very difficult for our algorithm (essentially very time-consuming) to discard
intervals not containing zero. The results are thus small enclosures for the roots
along with a proof of their existence and uniqueness and a long list of other, not
discarded, intervals, covering almost the whole interval $[1, n]$.

When the coefficient of $x^{19}$ is perturbed by the interval $[-2^{-19}, 2^{-19}]$, every
point between 8 and 20 is a root of a perturbed polynomial belonging to this
interval polynomial; indeed, our algorithm returns small enclosures for the roots
1 to 7 and a covering of [7.91, 22.11].
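
To make the size of the coefficients concrete, the expanded form of the Wilkinson polynomial can be generated with exact integer arithmetic. The sketch below is illustrative only (the function names are hypothetical); it reports how many bits the largest coefficient needs, which gives an idea of the initial precision required for an exact representation.

```python
import math

def wilkinson_coefficients(n):
    """Integer coefficients of prod_{i=1..n} (x - i), lowest degree first."""
    coeffs = [1]
    for i in range(1, n + 1):
        shifted = [0] + coeffs                    # x * p(x)
        scaled = [-i * c for c in coeffs] + [0]   # -i * p(x)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

def horner(coeffs, x):
    """Evaluate the polynomial at x by Horner's scheme (exact for integers)."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

w20 = wilkinson_coefficients(20)
largest = max(abs(c) for c in w20)
print(largest.bit_length(), largest >= math.factorial(20))
```

With exact integers the expanded polynomial still vanishes at 1, ..., 20; the difficulty described above only appears once coefficients or intermediate results are rounded.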



4.3 Numerical Linear Algebra 

Nowadays, algorithms for solving systems of linear equations with result
guarantee are very refined. If, however, the condition number of the involved matrix
is large, combining refined techniques with ordinary floating-point calculations
usually does not help. One example is the Hilbert matrix

\[ H_n := \left( \frac{1}{i + j - 1} \right)_{i,j = 1, \dots, n}. \qquad (1) \]

Its condition number is about $3.5^n$. Hence there is little hope to get the validated
inverse for large n by using double precision numbers. A further problem is that
we usually do not have to invert the Hilbert matrix but some other matrix
with unknown, possibly large condition number. This calls for using multiple
precision interval arithmetic. The user may choose the precision in advance, but
the inversion routine doubles the precision until it either produces the inverse
matrix or reaches a user-defined maximal precision.

The algorithm used is well known (see Rump [42]). In case we want to solve
a system of linear equations,

\[ Ax = b, \quad A \in \mathbb{R}^{n \times n}, \quad b \in \mathbb{R}^n, \]

we first compute an approximate inverse $R$ by, say, the Gaussian algorithm and
an approximate solution $\tilde{x}$. If the entries of $A$ are intervals, we take the
respective midpoints and compute the approximate inverse of the resulting matrix.
Introducing $y = x - \tilde{x}$, we can rewrite the system as

\[ y = R(b - A\tilde{x}) + (I - RA)y =: f(y). \]

Thus, we can start a fixed point iteration for $f$. This converges if the spectral
radius of $I - RA$ is smaller than 1. If $R$ is close to the inverse of $A$, this spectral
radius is close to zero and we have fast convergence.

Inversion is done in the same way. We just have to replace $b \in \mathbb{R}^n$ by the
$n \times n$ identity matrix.
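
The fixed point iteration can be tried out on a small Hilbert system. The following sketch is not the validated algorithm of [42]: it omits the interval enclosure step and instead carries out the iteration in exact rational arithmetic, with an approximate inverse R deliberately computed in double precision. All function names are hypothetical.

```python
from fractions import Fraction

def hilbert(n):
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

def approx_inverse(a):
    """Gauss-Jordan elimination in double precision: an inexact inverse R."""
    n = len(a)
    m = [[float(v) for v in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))   # partial pivoting
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                factor = m[r][c]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[c])]
    return [[Fraction(v) for v in row[n:]] for row in m]

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

n = 4
A = hilbert(n)
b = matvec(A, [Fraction(1)] * n)        # the exact solution is (1, 1, 1, 1)
R = approx_inverse(A)
x_tilde = matvec(R, b)                  # approximate solution
c = matvec(R, [bi - ri for bi, ri in zip(b, matvec(A, x_tilde))])  # R(b - A x~)

y = [Fraction(0)] * n
errors = []
for _ in range(3):
    ray = matvec(R, matvec(A, y))                           # R A y
    y = [ci + yi - ri for ci, yi, ri in zip(c, y, ray)]     # f(y) = c + (I - RA) y
    errors.append(max(abs(xt + yi - 1) for xt, yi in zip(x_tilde, y)))

print([float(e) for e in errors])
```

Each step multiplies the remaining error by roughly the size of $I - RA$, which is tiny here because R is already a good approximate inverse. A validated version would iterate on an interval vector and check inclusion in its interior.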

On a usual PC, the limits on n are not given by the increase of computation
time but mainly by the size of the memory. In Table 1, we list the computation
times t (in seconds on a 2.6 GHz Pentium) used for inversion of the $n \times n$
Hilbert matrix for certain values of n. The number of binary digits used in the
computation was $32 \cdot \lfloor 11(n + 2)/32 \rfloor$. The precision of the output is measured
by $\mathrm{diam}([H_n^{-1}])$, the maximal diameter of an entry in the computed enclosure
for $H_n^{-1}$.

Table 1. CPU time, number of used binary digits, diameter of the result.



  n    time (s)       d    diam([H_n^{-1}])
 16       0.074     176    0.37 · 10^{-?}
 32       0.91      352    0.64 · 10^{-?}
 64      18.45      704    0.17 · 10^{-??}
128     367.5      1408    0.47 · 10^{-4?}
256    8740        2816    0.16 · 10^{-8?}

(A question mark stands for an exponent digit that is illegible in the source.)



The precision for $n \in \{128, 256\}$ can be relaxed slightly to gain some speed;
$n = 256$, e.g., was also tested with $32 \cdot \lfloor 10(n + 2)/32 \rfloor$ binary digits in the
computation. Computation time was about 7402 seconds but the diameter was
$> 10^{-6}$.

Remark 1 There are benchmark competitions of supercomputers based on the
inversion of very large matrices. It is, however, said explicitly that the produced
matrices may have nothing to do with the true inverse. At the Dagstuhl seminar
which underlies these proceedings, U. Kulisch proposed to introduce a
benchmark test which consists of the inversion of the $500 \times 500$ Hilbert matrix
with a certain number of guaranteed correct digits. Now, we know at least the
correct result for $H_{256}^{-1}$ up to absolute precision of 80 digits.

4.4 Kronrod-Patterson Quadrature 

Kronrod-Patterson quadrature formulae

\[ Q_{n_k}^{KP,k}[f] = \sum_{\nu=1}^{n_k} a_\nu^{[k]} f(x_\nu^{[k]}), \qquad -1 < x_\nu^{[k]} < 1, \]

for the determination of $I[f] = \int_{-1}^{1} f(x)\,dx$ are defined as follows. Let $Q_n^G$ be the
Gaussian quadrature formula with n nodes.

1. $Q_{n_0}^{KP,0} = Q_n^G$.

2. $Q_{n_k}^{KP,k}$

a) involves $n_k = 2^k(n + 1) - 1$ nodes including all those from $Q_{n_{k-1}}^{KP,k-1}$,

b) yields the correct integral value for all polynomials of degree $< 3 \cdot 2^{k-1}(n + 1) - 1$.

We call $Q_{n_{k+1}}^{KP,k+1}$ a Kronrod-Patterson extension of $Q_{n_k}^{KP,k}$. Not even the
existence of Kronrod-Patterson extensions for $k > 1$ has been proved theoretically.
Nevertheless, it is one of the standard methods for numerical integration. Using
interval arithmetic, it is possible to give an existence proof and to determine
nodes $x_\nu^{[k]}$ and coefficients $a_\nu^{[k]}$. We sketch the method. Let

\[ p^{[k]}(x) = \prod_{\nu=1}^{n_k} (x - x_\nu^{[k]}). \]

Property 2b) is equivalent to

\[ \int_{-1}^{1} p^{[k]}(x)\, q(x)\, dx = 0 \quad \text{for all } q \in \mathcal{P}_{2^{k-1}(n+1)-1} \qquad (2) \]

(see [8, Theorem 55]). The initial quadrature formula is the Gaussian. The nodes
are the zeros of a Legendre polynomial, which can be evaluated easily (for
validation, we strongly recommend the use of its Chebyshev expansion and a
stable evaluation of Chebyshev polynomials $T_n(x) = \cos(n \arccos x)$, see below).
Now, given $p^{[k]}$, we want to determine $p^{[k+1]}$. Since $Q_{n_{k+1}}^{KP,k+1}$ uses the same nodes
as $Q_{n_k}^{KP,k}$, $p^{[k+1]}/p^{[k]}$ is a polynomial. We therefore write (2) for $p^{[k+1]}$ as

\[ \int_{-1}^{1} p^{[k]}(x)\, \frac{p^{[k+1]}(x)}{p^{[k]}(x)}\, T_\lambda(x)\, dx = 0 \quad \text{for } \lambda = 0, 1, \dots, 2^{k}(n + 1) - 1. \]

Expanding $p^{[k]}$ and $p^{[k+1]}/p^{[k]}$ in terms of Chebyshev polynomials, we obtain a
linear system for the Chebyshev coefficients of $p^{[k+1]}(x)/p^{[k]}(x)$, which can be
solved with the methods described, e.g., in Section 4.3. Knowing these
coefficients, we can use Newton's method to determine the nodes $x_\nu^{[k+1]}$. Finally, we
determine the Chebyshev coefficients of $p^{[k+1]}$ in order to allow the next step
and to determine the coefficients $a_\nu^{[k+1]}$.

Besides numerical linear algebra, the procedure requires the stable (and fast)
evaluation of Chebyshev polynomials. Such a method can be based on $T_0(x) = 1$,
$T_1(x) = x$ and the recurrence relations

\[ T_{2\nu}(x) = 2T_\nu(x)^2 - 1, \qquad T_{2\nu+1}(x) = 2T_{\nu+1}(x)T_\nu(x) - T_1(x). \]

Chebyshev polynomials of the second kind are treated similarly.
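
A possible way to organise such an evaluation is to compute the pair $(T_n, T_{n+1})$ by repeated index doubling, which needs only about $\log_2 n$ steps. The sketch below uses exact rationals so that it can be compared with the classical three-term recurrence; in a validated setting the same formulas would be applied to intervals. The function names are hypothetical, and the identity $T_{2\nu+2} = 2T_{\nu+1}^2 - 1$ used for odd n is the first doubling relation with $\nu + 1$ in place of $\nu$.

```python
from fractions import Fraction

def cheb_pair(n, x):
    """Return (T_n(x), T_{n+1}(x)) using index doubling."""
    if n == 0:
        return 1, x
    t, t1 = cheb_pair(n // 2, x)    # (T_v, T_{v+1}) with v = n // 2
    t_even = 2 * t * t - 1          # T_{2v}
    t_odd = 2 * t1 * t - x          # T_{2v+1}
    if n % 2 == 0:
        return t_even, t_odd
    return t_odd, 2 * t1 * t1 - 1   # (T_{2v+1}, T_{2v+2})

def cheb(n, x):
    return cheb_pair(n, x)[0]

def cheb_three_term(n, x):
    """Reference: T_{k+1} = 2 x T_k - T_{k-1}."""
    t_prev, t_cur = 1, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2 * x * t_cur - t_prev
    return t_cur

x = Fraction(3, 7)
print(cheb(10, x) == cheb_three_term(10, x))
```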

Not only the existence, but also the positivity of a quadrature formula, i.e.,
the positivity of its coefficients $a_\nu$ (in our case $a_\nu^{[k]}$), is important. From theory,
many nice properties follow from this positivity (see, e.g., [9]).

The presented iterative method is very sensitive with respect to
perturbations in an early step. Numerical validation therefore requires high precision
arithmetic.

Existence and positivity are proved by computing the enclosures for nodes
and coefficients. Non-existence may have different reasons. In our cases, it was
proved by showing that $p^{[k]}$ and its first derivative have the same sign at $-1$.
Hence, there must be a zero of $p^{[k]}$ or its first derivative on the left of the basic
interval, which means that we no longer have the full number of zeros in
$[-1, 1]$.

We have tested the program for $n_k < 1024$. Again, the restrictions on $n_k$
came from restrictions on the sizes of the matrices in the corresponding linear
systems. The results are



Theorem 1 The Kronrod-Patterson extensions with $n_k < 1024$ for $n_0 \notin \{2, 4\}$
exist and are positive. If $n = n_0 = 2$ (or $n = n_0 = 4$), we have existence and
positivity for $n_k \le 47$ (or $n_k \le 319$) as well as non-existence for $n_k = 95$ (or
$n_k = 639$, respectively).
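
The node counts that can occur for a given Gaussian starting formula follow from $n_k = 2^k(n + 1) - 1$. The small sketch below lists them up to the tested limit; the function name node_counts is hypothetical.

```python
def node_counts(n, limit=1024):
    """Node counts n_k = 2**k * (n + 1) - 1 below the given limit, for k = 0, 1, ..."""
    counts = []
    k = 0
    while 2**k * (n + 1) - 1 < limit:
        counts.append(2**k * (n + 1) - 1)
        k += 1
    return counts

print(node_counts(2))   # starting from the 2-point Gaussian formula
print(node_counts(4))   # starting from the 4-point Gaussian formula
```

For n = 2 the sequence passes through 47 and then 95; for n = 4 it passes through 319 and then 639.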

4.5 An Oscillating Integrand from Mathematical Finance



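The starting point of GMP-XSC was the numerical computation of the price of an
arithmetic-average Asian option according to Schröder's integral representation
[45]. The computationally complicated part is the integral (3).

[Equation (3) is not recoverable from the source. The surviving fragments show an integral with respect to $y$ whose integrand involves $\cosh y$, an exponential factor, a Hermite function and the imaginary part of erfc at a complex argument with denominator $\sqrt{2h}$.]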

where the parameters, among them $q$ and $h$, are certain positive quantities. $H_\mu$ is a
Hermite function, which is defined for negative $\mu$ by

\[ H_\mu(z) = \frac{1}{\Gamma(-\mu)} \int_0^\infty e^{-t^2 - 2tz}\, t^{-\mu-1}\, dt \qquad (4) \]

(see, e.g., [32]). From this, we get all Hermite functions by applying

\[ H_{\mu+1}(z) = 2zH_\mu(z) - 2\mu H_{\mu-1}(z). \]

$\Im$ denotes the imaginary part and erfc is the complementary error function,

\[ \mathrm{erfc}(z) = \frac{2}{\sqrt{\pi}} \int_z^\infty e^{-t^2}\, dt. \]

Properties of these two special functions are given, e.g., in [6] and [32].

The main difficulty is that, due to the oscillatory nature of the integrand,
the complete integral is smaller than the maximum of the integrand by a factor
of $10^{-k}$, where $k$ ranges from dozens to hundreds. This required a validated
error control with the help of automatic differentiation combined with interval
computations or complex interval computations. Evaluation of the integrand
requires the computation of special functions (partly with non-real arguments)
with interval arithmetic. This led to the features that are incorporated in
GMP-XSC up to now.
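
The effect of such cancellation can be seen in a much simpler setting than the integral above. The following sketch is an analogy only and does not evaluate the integrand of (3): it sums the alternating Taylor series of $e^{-30}$, whose largest terms are about $10^{11}$ while the result is about $10^{-13}$. In double precision the sum is meaningless, whereas exact rational arithmetic (standing in for a sufficiently high working precision) recovers the value. The function names are hypothetical.

```python
import math
from fractions import Fraction

def exp_series_float(x, terms=150):
    """Naive Taylor sum of exp(x) in double precision."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)
    return total

def exp_series_exact(x, terms=150):
    """The same sum in exact rational arithmetic."""
    x = Fraction(x)
    total, term = Fraction(0), Fraction(1)
    for k in range(terms):
        total += term
        term *= x / (k + 1)
    return total

naive = exp_series_float(-30.0)
exact = float(exp_series_exact(-30))
print(naive, exact, math.exp(-30))
```

An interval version of the naive sum would at least reveal the problem by returning a very wide enclosure, which is the kind of warning that validated error control provides.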

Details are given in [35]. 



4.6 Global Optimization: Some Difficult Cases 

For one of the authors, a motivation to work on multiple precision interval
arithmetic came from difficulties encountered with the global optimization of some
“nasty” functions.

Interval arithmetic is the arithmetic of choice to do global optimization of
continuous functions which are not necessarily convex. Indeed, it provides global
information on the function, such as an enclosure of its range over a whole
(interval) set. In contrast, deterministic classical numerical algorithms provide
an optimum which is guaranteed to be global only under some stringent
conditions. Probabilistic methods return an optimum that is close to the global
optimum with a prescribed probability, but without guarantee. Interval algorithms,
such as Hansen's algorithm [18,22], have been
developed in order to determine the guaranteed global optimum of a function.
These methods can be costly in terms of computational time and memory.

However, even interval arithmetic can fail to determine the global
optimum of some functions. Indeed, the functions which are difficult to optimize
can be roughly classified into two types. Some functions are extremely flat, cf.
the Ratz 8 function represented on the left of figure 2. With flat functions,
using double floating-point precision, the optimum is very well approximated:
[0.00000, 1.00564E-08] for the Ratz 8 function (cf. [48]), but the optimizer is
not accurately determined; a whole region containing points where the
function takes values close to the optimal one is returned:
$[0.93750, 1.09375]^9 \times [-10.0000, 10.0000]$ for the Ratz 8 function.

Other nasty functions are “egg-box” functions, which have a huge
number of local optimizers. For example, the Levy (n° 3)
function on $[-10, 10]^2$ (cf. right part of figure 2), defined as

\[ f(x, y) = \left( \sum_{i=1}^{5} i \cos[(i - 1)x + i] \right) \times \left( \sum_{j=1}^{5} j \cos[(j + 1)y + j] \right), \]

has 760 local minima and 18 global minima; with $n = 10$, the following function has
$10^{10}$ local minima and only one global minimum:

\[ f(x_1, \dots, x_n) = 10 \sin(\pi x_1)^2 + (x_n - 1)^2 + \sum_{i=1}^{n-1} (x_i - 1)^2 \left[ 1 + 10 \sin(\pi x_{i+1})^2 \right]. \]
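
For orientation, the second function can be written down directly in floating-point arithmetic, as in the sketch below (the name levy_like is hypothetical). All terms are non-negative, and the function vanishes at $(1, \dots, 1)$ up to rounding; every other point with integer coordinates yields a small but positive value, which hints at the grid-like arrangement of local minima that makes the problem hard.

```python
import math

def levy_like(x):
    """10 sin(pi x_1)^2 + (x_n - 1)^2 + sum (x_i - 1)^2 [1 + 10 sin(pi x_{i+1})^2]."""
    n = len(x)
    total = 10 * math.sin(math.pi * x[0]) ** 2 + (x[-1] - 1) ** 2
    for i in range(n - 1):
        total += (x[i] - 1) ** 2 * (1 + 10 * math.sin(math.pi * x[i + 1]) ** 2)
    return total

print(levy_like([1.0] * 10))   # global minimizer: value 0 up to rounding
print(levy_like([0.0] * 10))   # another integer grid point: value 10
```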



For such functions, the program usually runs out of memory: a huge list of
intervals which are potential optimizers is kept; the program does a “best first”
search and subdivides a lot of these candidates, but it does not manage to discard
them.

Furthermore, the local optima can be very close to the global one, which
means that the interval algorithm cannot discard them. An example can be found
in chemistry, with a problem of molecular conformation [34,46]: the problem
is to determine the localization of particles, through the minimization of the
electrostatic energy of the system. More formally, the problem is to determine
the global minimum of

\[ \sum_{i} \sum_{j > i} \frac{1}{d(X_i, X_j)}, \]

where $X_i$ and $X_j$ are the locations of particles $i$ and $j$, and $d$ is the Euclidean
distance, subject to each $X_i$ lying on the unit sphere. This problem takes values
ranging from the global minimum to infinity (when two particles are located
at the same place): this means that multiple precision can help to magnify the
difference between local and global minima. Furthermore, the number of local
minimizers is huge and it is impossible to gather them into a single region,
since every local minimizer is isolated. The memory needed to store the list of
potential optimizers is thus large. It is a modern challenge to determine and
prove the optimality of configurations with over 120 particles.
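
As a small illustration of the objective function, the sketch below evaluates the electrostatic energy of six particles placed at the vertices of a regular octahedron on the unit sphere. It counts each pair once and assumes Python 3.8 or later (for math.dist); the function name coulomb_energy is hypothetical. The sketch only evaluates the energy and makes no attempt at optimization.

```python
import math
from itertools import combinations

def coulomb_energy(points):
    """Sum of 1 / d(X_i, X_j) over all unordered pairs of particles."""
    return sum(1.0 / math.dist(p, q) for p, q in combinations(points, 2))

octahedron = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
              (0, -1, 0), (0, 0, 1), (0, 0, -1)]
energy = coulomb_energy(octahedron)
print(energy)   # 12 pairs at distance sqrt(2) and 3 antipodal pairs at distance 2
```

If two particles coincide, the distance is zero and the division fails, which corresponds to the energy tending to infinity.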




Fig. 2. Ratz 8 and Levy (n° 3) functions 



The global optimization of such functions can greatly benefit from multiple
precision interval arithmetic. The development of dedicated software is
ongoing work.

5 Availability 

The current software packages, corresponding documentations and application 
programs are available through the internet. 

5.1 intpakX 

This Maple package is available on

http://www.math.uni-wuppertal.de/wrswt/software/intpakX/

together with some documentation and examples.

It is also available as “Research Powertool Interval Arithmetic” from
Waterloo Maple™ on

http://www.mapleapps.com/powertools/ResearchApplication.shtml.

5.2 MPFI 

MPFI is a C package. It is available on
http://perso.ens-lyon.fr/nathalie.revol/software.html: it includes documentation,
the source code and some rudimentary tests. This software requires a C compiler,
GMP (which can be downloaded at http://www.swox.com/gmp/) and MPFR (available on
http://www.mpfr.org/).

5.3 GMP-XSC

This package is available on http://www.tu-bs.de/~petras/software.html,
where installation and usage are described. This software requires C, C++ and
GMP. The latter is often part of Linux distributions or may be obtained via,
e.g., http://www.swox.com/gmp/

The applications mentioned in sections 4.3, 4.4 and 4.5 can also be found on

http://www.tu-bs.de/~petras/software.html.

6 Conclusion 

This paper presents a survey of existing packages for multiple precision inter- 
val arithmetic. Details are given for three packages: intpakX for Maple (which 
focuses on ease of use), and MPFI and GMP-XSC for C/C++ (which focus on 
efficiency and reliability through the use of a programming language). These 
three packages have been compared in Section 3. 

The results show that getting tight and guaranteed results may sometimes 
take a lot of time, especially if a program is designed to be easy to use. This 
particularly applies to the standard functions which have to be further optimized. 
Yet, it is expected that multiple precision interval arithmetic will be more widely 
used in the future, since various complete, easy and rather efficient packages are 
now available. We hope that input from an increasing number of users will help 
improve our packages. 

Abstract. The COSY Infinity software package by Berz et al. is widely 
used in the beam physics community. We report execution-based test- 
ing of its interval and Taylor model arithmetics. The testing strategy is 
careful to avoid contamination by inevitable rounding errors. Tests were 
ported to Sun’s F95 and INTLAB. In each package, we uncovered vio- 
lations of containment which have all been corrected by their authors. 
We encourage users of COSY and most other software packages to check 
author/vendor web sites regularly for possible updates and patches. 



1 Testing COSY’s Interval Arithmetic 

During Spring 2002, the reliable computing email list 
reliable_computing@interval.louisiana.edu had an active discussion of COSY 
Infinity [1,9] (Berz et al., available from http://cosy.pa.msu.edu [2]). COSY 
Infinity is an arbitrary order package for multivariate automatic 
differentiation and interval and Taylor model arithmetic. It can be used in an 
interpreted version, which we tested, in a compiled version from Fortran 77 and 
C programs, or through objects in Fortran 90 and C++. The reliable_computing 
discussions raised concerns 
about the reliability of interval and Taylor model arithmetics, so Berz commis- 
sioned the execution-based testing of COSY interval arithmetic we report here. 
We also applied our tests to Sun Microsystems’ Fortran 95 [10] and Rump’s 
INTLAB for MATLAB [13,14,15]. 

Testing software is challenging. Myers summarizes the testing philosophy: “The 
purpose of testing is to find errors” [11]. Kit [8], Kaner et al. [6], and 
Whittaker [16] describe best practices in industrial software quality assurance. 

Authors of many packages for interval arithmetic have tested their work, but 
there is little literature describing those tests. In TOMS 737 [7], Kearfott et 
al. tested their Fortran 77 INTLIB arithmetic operations with a combination 
of specially constructed and randomly generated arguments. Corliss [4] gave a 
suite of programs for “testing” environments for interval arithmetic for usability 
and speed. Sun Microsystems says their Fortran 95 interval elementary function 
library has undergone exhaustive testing, which is confidential. 

The focus of this paper is on the testing of COSY’s interval and Taylor model 
arithmetic. Since we found little methodological discussion in the literature, we 
developed testing methods that could be applied more generally. Besides testing 
COSY, we also applied our methods to Sun’s F95 and INTLAB, primarily to 
validate the methods themselves. The testing methods have wider utility, 
but our focus is execution-based testing of COSY. 

2 What Is “Correct?” 

The fundamental tenet of the interval community is, “Thou shall not lie!” It is 
an error to i) violate containment or ii) assert a mathematical falsehood. Our 
testing exposed violations of containment for 

1. COSY: power when the exponent is not an integer, but very close to it. 

2. COSY: (with warning) tan when the interval argument crosses a discontinuity. 

3. INTLAB: sqrt for most arguments. 

4. Sun F95: tanh for many negative arguments. 

5. COSY Taylor models: sin, asin, and acos. 

We give details of errors we found in Sects. 5 and 9. On the other hand, questions 
of appropriate domains for interval operations, tightness of enclosures, speed, 
and ease of use are not considered errors, but may represent opportunities for 
improved performance. We raise some of those issues in Sects. 6, 7, and 8. 

3 Test Strategy 

To complete the testing in a timely manner, we accepted a very narrow scope. 
We tested the arithmetic operations unary and binary addition and subtraction, 
multiplication, and division, and the intrinsic functions power, sin, cos, tan, 
asin, acos, atan, sinh, cosh, tanh, log, exp, sqrt, sqr, and isqrt. Our goal is to 
identify i) violations of containment or ii) assertions of mathematical falsehood. 
We developed a set of test cases consisting of an interval vector [x] and an 
expression f(x). Expected results are computed a posteriori in Maple. We did 
not attempt testing of other features of COSY including its linear dominated 
bounder, shrink-wrapping, or ODE solving. 

We denote by [f([x])] the result of challenging the interval arithmetic to 
evaluate f on the interval [x]. We seek examples x ∈ [x] for which f(x) is not in 
[f([x])]. We do not need to know the true containment set of f([x]). Instead, we 
use Maple as the “referee” of containment. We 

1. Read each test case into a COSY driver; 

2. Construct COSY intervals for the arguments; 

3. Evaluate the expression using COSY interval arithmetic; 

4. Write binary values of the arguments and the COSY result; 

5. Read the binary arguments and COSY results into Maple; 

6. Perform many point evaluations f(x) for x ∈ [x]; 

7. Compare Maple’s f(x) with the COSY enclosure. 

The most challenging aspect of conducting the tests was to prevent inevitable 
roundoff errors from contaminating our results. 

3.1 Roundoff Errors 

Suppose we wish to test sin on [0.1, 0.6] on an IEEE arithmetic machine. Fun- 
damentally, it is impossible, since 0.1 and 0.6 are not exactly representable. We 
cannot even express the question, “What is sin [0.1, 0.6]?” 
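The point can be made concrete with a short sketch. It is written in Python, which is not one of the tools used in the tests reported here; the helper name exact_value is ours. It prints the exact values of the doubles nearest to 0.1 and 0.6 and shows that the binary interval is not the decimal interval [0.1, 0.6].

```python
from decimal import Decimal
from fractions import Fraction


def exact_value(x: float) -> Decimal:
    """Return the exact decimal expansion of the binary double x.

    Constructing a Decimal from a float is exact; no rounding takes place.
    """
    return Decimal(x)


lo, hi = 0.1, 0.6  # the literals are rounded to the nearest double on input

print(exact_value(lo))  # exact value stored for the literal 0.1
print(exact_value(hi))  # exact value stored for the literal 0.6

# Compare the stored doubles with the rationals 1/10 and 3/5.
print(Fraction(lo) > Fraction(1, 10))  # stored lower endpoint lies above 0.1
print(Fraction(hi) < Fraction(3, 5))   # stored upper endpoint lies below 0.6
```

With round to nearest, both stored endpoints fall inside the decimal interval, so a test that claims to challenge sin on [0.1, 0.6] in fact challenges it on a slightly different interval.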

Roundoff errors may be introduced into tests of interval software when we 

1. Read test cases into the test driver; 

2. Construct interval(s) for the arguments; 

3. Extract interval bounds from arguments and results; 

4. Write arguments and results to a file; 

5. Read arguments and results into Maple; 

6. Construct Maple variable precision representations; 

7. Perform Maple operations; 

8. Report from Maple. 

Table 1 sketches the flow of the testing process. It shows the communication 
from COSY to Maple of both the exact (binary) argument(s) used to 
challenge each arithmetic operation or intrinsic function and the exact (binary) 
result computed by COSY. We consider each potential source of roundoff error 
in turn. The issue is not with Maple. Issues 1-3 concern COSY. Issues 4-6 
concern communication between any pair of dissimilar software packages. 



Table 1. Schematic of the flow of testing 



COSY                              Maple referee

Enter [0.1, 0.6] 
  ⇓ round near 
[ internal IEEE 754 ] 
  ⇓ INTV round out                  ⇓ multiprecision 
INTV(...)                         x ∈ [...] 
  ⇓ f round out                     ⇓ f multiprecision 
funct(...)                        f(x) 
                                  enclosed in? 



Read Test Cases into the Test Driver. We must separate the testing of 
the input and output routines from the testing of the operations. Our goal is 
to test the operations of interval arithmetic. We read files of test cases into a 
test driver. We view the internal binary values as truth, while the ASCII values 
in the file are viewed as approximations. In the few cases where the difference 
matters, we use test arguments that are exactly representable in binary, and we 
check whether they are read exactly. 

Construct an Interval. COSY’s interval constructor INTV() by default adds 
one ULP outward to its arguments to compensate for assumed possible inward 
rounding in assigning their values. We tested COSY using the INTV() constructor 
to model usual use. Using INTV() prevented us from testing cases such as 
asin([1, 1]) because INTV(1.0, 1.0) contains points at which asin is not defined. 







G.F. Corliss and J. Yu 



Extract Interval Bounds. After challenging COSY’s interval arithmetic, we 
write the challenge arguments and the COSY results to a file. We use the COSY 
functions INL() and INU() to extract the lower and upper bounds, respectively, 
of the COSY intervals. We verified by both execution testing and by direct code 
inspection that INL() and INU() return their respective values with no rounding. 

Write Arguments and Results to a File. To avoid roundoff in writing the 
challenge arguments and COSY’s results to a file for reading by Maple, we write 
them in binary form, either big endian or little endian, depending on the host 
testing platform. 

Read Arguments and Results into Maple. We read binary representations 
of the challenge arguments and the COSY results into Maple. The binary rep- 
resentation is system dependent, so we used different functions to read binary 
files written in big or little endian formats. We verified the correct transmission 
of values by comparing HEX dumps from COSY, Maple, and DOS’s debug. 
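As an illustration of why binary transfer avoids roundoff, the following Python sketch round-trips doubles through their 8-byte images in either byte order. It is not the COSY or Maple code used in the tests; the function names write_doubles and read_doubles are hypothetical.

```python
import struct


def write_doubles(values, little_endian=True):
    """Pack doubles into raw IEEE 754 bytes in the requested byte order."""
    fmt = ("<" if little_endian else ">") + f"{len(values)}d"
    return struct.pack(fmt, *values)


def read_doubles(raw, little_endian=True):
    """Unpack raw bytes written by write_doubles with the same byte order."""
    count = len(raw) // 8
    fmt = ("<" if little_endian else ">") + f"{count}d"
    return list(struct.unpack(fmt, raw))


values = [0.1, 0.6, 0.1 + 0.2]

# Binary transfer reproduces every bit, in either byte order.
for little in (True, False):
    assert read_doubles(write_doubles(values, little), little) == values

# A hex dump makes the byte order visible, as in a debugger.
print(write_doubles([4.0], little_endian=True).hex())
print(write_doubles([4.0], little_endian=False).hex())

# A 15-digit decimal text file, by contrast, does not round-trip.
text = f"{0.1 + 0.2:.15g}"
print(float(text) == 0.1 + 0.2)
```

The reader must know which byte order the writer used; reading with the wrong order silently yields different numbers.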

Construct Maple Variable Precision Representations. We read the binary 
file into Maple using 900 decimal digit arithmetic. Maple’s binary read and 
conversion to decimal are not accurate (as was previously known), so we wrote 
our own binary read in Maple, reading each byte as an integer and reassembling 
the IEEE representation using Maple’s 900 decimal digit arithmetic. 
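The byte-by-byte reassembly can be sketched as follows. This is a Python analogue under our own naming (decode_ieee_double), not the Maple routine written for the tests; it uses exact rational arithmetic where the tests used Maple's 900 digit arithmetic.

```python
import struct
from fractions import Fraction


def decode_ieee_double(raw: bytes, little_endian: bool = True) -> Fraction:
    """Reassemble the exact value of an IEEE 754 double from its 8 bytes."""
    if len(raw) != 8:
        raise ValueError("an IEEE double occupies exactly 8 bytes")
    bits = int.from_bytes(raw, "little" if little_endian else "big")
    sign = -1 if bits >> 63 else 1
    exponent = (bits >> 52) & 0x7FF        # 11 bit biased exponent
    fraction = bits & ((1 << 52) - 1)      # 52 stored mantissa bits
    if exponent == 0x7FF:
        raise ValueError("infinity or NaN has no rational value")
    if exponent == 0:
        # zero or subnormal: no hidden bit, fixed scale 2**-1074
        mantissa, power = fraction, -1074
    else:
        # normal number: restore the hidden leading 1 bit
        mantissa, power = fraction | (1 << 52), exponent - 1075
    return sign * mantissa * Fraction(2) ** power


# Cross-check against Python's own exact float-to-rational conversion.
for x in (0.1, -4.0, 5e-324, 1.7976931348623157e308):
    assert decode_ieee_double(struct.pack("<d", x)) == Fraction(x)
```

Because every step is integer or rational arithmetic, the decoded value is the number the interval package actually used, not an approximation of it.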

Perform Maple Operations. We used 900 decimal digits to ensure that the 
full dynamic range of 53 bit mantissa IEEE double precision numbers is exactly 
representable in the decimal form used by Maple’s variable precision arithmetic. 
Even if Maple’s variable precision arithmetic were not accurate in the last few 
digits, we are safe, since we are detecting violation of containment errors in about 
the 14th to 17th decimal digit. 

Is 900 decimal digits “large?” No. In order for our logic to hold, we must 
ask Maple to evaluate the sin at exactly the same endpoints with which we 
challenged COSY. We have “exact” in the form of binary values. 53 binary 
digit numbers are exactly representable in a finite number (56) of decimal 
digits (not the other way around). Representing the full range of IEEE double 
precision numbers, about 10^-308 to 10^308, requires another 617 decimal digits. 
To get Maple to evaluate sin at INF(X) and SUP(X) as evaluated by COSY, 
we must use at least 673 decimal digits in Maple. 900 gives a margin of error in 
case Maple’s last few digits might be in error, of which we saw no evidence. In 
practice, we saw some incorrectly diagnosed “failures” using 100 decimal digits, 
but not with 200 digits. 
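A small Python sketch, using the standard decimal module as a stand-in for Maple's variable precision arithmetic, shows why the working precision matters: a binary value is reproduced exactly only if the decimal precision covers all of its digits. The helper names are ours.

```python
from decimal import Decimal, localcontext


def significant_digits(x: float) -> int:
    """Count the significant decimal digits in the exact expansion of x."""
    return len(Decimal(x).as_tuple().digits)


def rounded_to(x: float, digits: int) -> Decimal:
    """Round the exact value of x to the given number of decimal digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return +Decimal(x)  # unary plus applies the context precision


exact = Decimal(0.1)
print(significant_digits(0.1))          # digits needed for the double near 0.1
print(rounded_to(0.1, 20) == exact)     # too few digits: the endpoint moves
print(rounded_to(0.1, 900) == exact)    # ample digits: the endpoint is exact
```

If the referee works with too few digits, it evaluates f at a point that differs from the endpoint presented to the package under test, which can produce spurious "failures" of the kind mentioned above.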

Violations of containment are detected in Maple by comparing Maple’s 900 
digit evaluation of f(x) with COSY’s enclosure. If a violation of containment 
were due to a rounding error in Maple’s evaluation, the failure of containment 
would be in the last few of the 900 digits, and increasing to 1000 or more digits 
would resolve them. In all violations of containment we observed, the failure was 
of approximately the accuracy of double precision computation, and increasing 
the number of digits had no effect. 
For each violation of containment detected by Maple, careful human 
examination of the test case confirmed that the reported violation truly 
represents a failure of the software under test. We used Maple for its arbitrary 
precision capabilities to detect the errors, but once found, errors are visible to 
the reader in this paper or in COSY execution with no need to rely on Maple. 

Report from Maple. Values printed by Maple are subject to rounding error 
on output, but all of our conclusions have been drawn using internal Maple 
representations. Any Maple output rounding has no effect on our conclusions. 

3.2 Test Cases 

We tested 30 multi-operation expressions, but if an arithmetic package gets indi- 
vidual operations and intrinsic functions right, it will get complicated expressions 
right, too. Hence, we tested primarily 2,600+ expressions composed of a single 
operation or intrinsic function. 

For elementary operations, no matter how wide the arguments, extrema occur 
at the endpoints, except for division by intervals containing zero. Similarly for 
intrinsic functions, extrema are always at the endpoints, except for a modest set 
of exceptions (e.g., sin and cos for arguments that span π or π/2), which we 
enumerate and test. Hence, we are most likely to find violations of containment 
at endpoints of the challenge arguments. 

Our Maple “referee” checks interior points, but we observed no failures at 
interior points. For each test case, we have Maple evaluate the expression under 
test at 11 points in the challenge argument interval using 900 decimal digit 
approximate arithmetic, as illustrated in the pseudo-code below. All errors we 
found would have been detected using only two points in the challenge interval. 
If f(x) is not in COSY’s result interval, we have a likely violation of containment, 
which we verify by human inspection of results as described in Sect. 5. 

for (i = 0; i <= 10; i++) { 
    y = INF(X) + (SUP(X) - INF(X)) * i / 10.0 
    fx = f(y) 
    ERROR if fx is outside COSY result 
} 
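The loop above can be sketched in runnable form. The version below is Python rather than Maple, uses exact rational arithmetic as the referee, and restricts f to the square function so that every evaluation is exact; these choices and the name violations are ours, made only for illustration. It requires Python 3.9 or later for math.nextafter.

```python
import math
from fractions import Fraction


def violations(f_exact, lo, hi, enc_lo, enc_hi, n_points=11):
    """Return indices i of sample points whose exact image leaves the enclosure."""
    lo_q, hi_q = Fraction(lo), Fraction(hi)
    bad = []
    for i in range(n_points):
        y = lo_q + (hi_q - lo_q) * Fraction(i, n_points - 1)  # exact sample point
        fy = f_exact(y)                                       # exact evaluation
        if not (Fraction(enc_lo) <= fy <= Fraction(enc_hi)):
            bad.append(i)
    return bad


def square(y):
    return y * y


lo, hi = 0.1, 0.6

# "Enclosure" from plain round-to-nearest arithmetic: no outward rounding.
naive = (lo * lo, hi * hi)

# Enclosure widened by one ULP at each end.
outward = (math.nextafter(lo * lo, -math.inf), math.nextafter(hi * hi, math.inf))

print(violations(square, lo, hi, *naive))    # index 0 (lower endpoint) is reported
print(violations(square, lo, hi, *outward))  # no violations
```

As in our tests, the failure of the unrounded enclosure shows up at an endpoint of the challenge interval, not at an interior point.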

We might look at extrema of the function, check at randomly chosen points, 
or at far more points. There are separate test cases to challenge evaluation within 
one ULP of extrema, so checking at extrema is already covered. Random tests 
are rarely as effective at uncovering errors as carefully constructed challenges; 
our test cases uncovered all the errors we found. None of our 500,000 random 
tests uncovered an error. Similarly, we had checked at 10,000 points (vs. 11) 
early in our testing, but all the errors we found were at endpoints. 

Most of our test cases came from TOMS 737 [7]. Kearfott et al. tested their 
Fortran 77 INTLIB interval arithmetic operations with a combination of spe- 
cially constructed and randomly generated arguments. We added a few specially 
constructed arguments of our own and 30 multi-operation expressions taken from 
tests of a validated quadrature package by Corliss and Rall [3]. In general, we 
expect interval arithmetic to be most likely to fail for very large or very small 
(in either absolute or relative terms) domain or range values, near boundaries of 
domains, or near underflow or overflow. 

To increase the coverage of our tests of binary operations, each pair of argu- 
ments was used in several combinations. For example for addition and subtrac- 
tion, argument intervals [a] and [b] give test cases 

— [a] + [b], [a] - [b], [-a] + [b], [-a] - [b] 

— [-a] + [-b], [-a] - [-b], [a] + [-b], [a] - [-b] 

— [b] + [a], [b] - [a], [-b] + [a], [-b] - [a] 

— [-b] + [-a], [-b] - [-a], [b] + [-a], [b] - [-a] 
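The sixteen addition and subtraction cases for one argument pair can be generated mechanically. The following Python sketch is our own illustration (the names negate and add_sub_cases are hypothetical); it represents an interval as a (lower, upper) tuple and takes [-a] to mean the negated interval.

```python
from itertools import product


def negate(interval):
    """Return the negated interval: -[lo, hi] = [-hi, -lo]."""
    lo, hi = interval
    return (-hi, -lo)


def add_sub_cases(a, b):
    """Enumerate both argument orders, both signs of each operand, and + and -."""
    cases = []
    for x, y in ((a, b), (b, a)):
        for neg_x, neg_y, op in product((False, True), (False, True), ("+", "-")):
            left = negate(x) if neg_x else x
            right = negate(y) if neg_y else y
            cases.append((left, op, right))
    return cases


cases = add_sub_cases((1.0, 2.0), (3.0, 4.0))
for left, op, right in cases:
    print(left, op, right)
```

Two orders, two signs for each operand, and two operations give 2 x 2 x 2 x 2 = 16 challenges from a single pair of arguments.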

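For multiplication, with $0 < [\underline{a}, \overline{a}]$ and $0 < [\underline{b}, \overline{b}]$, we test 16 combinations: 

— $[\underline{a}, \overline{a}] \times [\underline{b}, \overline{b}]$, $[-\underline{a}, \overline{a}] \times [\underline{b}, \overline{b}]$, $[-\overline{a}, -\underline{a}] \times [\underline{b}, \overline{b}]$, $[-\overline{a}, \underline{a}] \times [\underline{b}, \overline{b}]$ 

— $[\underline{a}, \overline{a}] \times [-\underline{b}, \overline{b}]$, $[-\underline{a}, \overline{a}] \times [-\underline{b}, \overline{b}]$, $[-\overline{a}, -\underline{a}] \times [-\underline{b}, \overline{b}]$, $[-\overline{a}, \underline{a}] \times [-\underline{b}, \overline{b}]$ 

— $[\underline{a}, \overline{a}] \times [-\overline{b}, -\underline{b}]$, $[-\underline{a}, \overline{a}] \times [-\overline{b}, -\underline{b}]$, $[-\overline{a}, -\underline{a}] \times [-\overline{b}, -\underline{b}]$, $[-\overline{a}, \underline{a}] \times [-\overline{b}, -\underline{b}]$ 

— $[\underline{a}, \overline{a}] \times [-\overline{b}, \underline{b}]$, $[-\underline{a}, \overline{a}] \times [-\overline{b}, \underline{b}]$, $[-\overline{a}, -\underline{a}] \times [-\overline{b}, \underline{b}]$, $[-\overline{a}, \underline{a}] \times [-\overline{b}, \underline{b}]$ 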

and similarly for division. In addition, we constructed more than 500,000 random 
tests that discovered no additional errors: 

loops for i and j 
    a = RAND(); b = RAND(); 
    x1 := +- 0.a * 2^(+-i); 
    x2 := +- 0.b * 2^(+-j); 
    [X] := [x1, x2]; 
    expr(X); 
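A runnable reading of this pseudo-code is sketched below in Python. The random choice of the signs, the ordering of the two endpoints, the exponent range, and the fixed seed are our assumptions; the pseudo-code leaves them open.

```python
import math
import random


def random_test_intervals(max_exponent, rng):
    """Yield intervals with endpoints +-0.a * 2**(+-i), one per (i, j) pair."""
    for i in range(max_exponent + 1):
        for j in range(max_exponent + 1):
            a, b = rng.random(), rng.random()  # mantissas 0.a and 0.b in [0, 1)
            x1 = rng.choice((-1.0, 1.0)) * math.ldexp(a, rng.choice((-i, i)))
            x2 = rng.choice((-1.0, 1.0)) * math.ldexp(b, rng.choice((-j, j)))
            yield (min(x1, x2), max(x1, x2))   # order the endpoints


rng = random.Random(2003)  # fixed seed so that any failing case can be replayed
cases = list(random_test_intervals(20, rng))
print(len(cases), cases[:3])
```

Scaling by powers of two spreads the arguments over many orders of magnitude, which a uniform random draw on a fixed interval would not do.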

4 Test Environment 

Our tests of COSY and INTLAB were executed on an HP notebook PC N5270 
with a 700 MHz Pentium III processor, 128 MB RAM, and a 20 GB hard disk 
under Microsoft Windows ME. The tests were replicated on a Toshiba Satellite 
4090XDVD with an Intel Celeron at 400 MHz, 128 MB RAM, running Windows 
98. Our tests of Sun Workshop 6 were conducted on a Sun Enterprise 250, 
UltraSPARC 3 with one CPU at 450 MHz with 512 MB RAM. We tested 

— COSY version 8.1 (updated June 8, 2002) downloaded from 

www.cosy.pa.msu.edu on June 25, 2002. The tests were repeated on 
a modified version of COSY provided on May 2, 2003. 

— Sun Workshop 6 update 1 Fortran 95 6.1 2000/09/11 (from f95 -V . . .). The 
tests were repeated with a patched version released in September, 2002. 

— INTLAB version 4.0, www.ti3.tu-harburg.de/~rump/intlab downloaded 
on January 15, 2003. The tests were repeated on Version 4.1.1 downloaded 
on January 22, 2003. 

We used Maple 6 and MATLAB version 5.2. In Maple, we use little beyond the 
underlying variable precision arithmetic, so newer versions should have no effect 
on our tests. The error in INTLAB was traced to an anomaly in MATLAB which 
might be changed in a later version, although Rump observed the same anomaly 
in the current MATLAB version as of January, 2003. 



5 Test Results 

In this section, we report the results of our tests. In Sect. 3.2, we claimed to 
have verified suspected errors by human inspection. In this section, we offer the 
errors for inspection by the reader. Maple found the errors, but the reader can 
see them with no dependence on Maple. 

5.1 COSY: POWER Near an Integer 

Test case (ASCII): [2.0, 2.0]^1.00000000001 

As presented to COSY: 2^1.00000000001000000000827... (approximate decimal 
representation of binary value) 

COSY result: [1.999999999999999555..., 2.000000000000000444...] (approximate 
decimal representation) 

Maple’s f(x): 2.0000000000138... (approximate decimal representation), 
which violates containment by about 10^-11. 

Cause: The POWER operator was intended only for internal use by COSY 
for integer and half-integer exponents. Exponents within 10^-10 of an integer or 
a half-integer are rounded to the nearby integer or half-integer. Exponents 
further from an integer or a half-integer are rounded with a warning message. 

Solution: COSY authors removed the POWER operator from the list of user 
callable operations. 
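The size of this violation can be checked with ordinary double precision arithmetic, as in the Python sketch below. It is only a plausibility check, not the 900 digit referee computation; the bounds are the rounded decimal values quoted above.

```python
import math

base = 2.0
exponent = 1.00000000001           # 1 + 1e-11, rounded to the nearest double

value = math.pow(base, exponent)   # double precision evaluation of 2**exponent

# First-order estimate: 2**(1 + d) is approximately 2 + 2*ln(2)*d for small d.
estimate = 2.0 + 2.0 * math.log(2.0) * 1e-11

# Enclosure reported by COSY (approximate decimal representation).
cosy_lo, cosy_hi = 1.999999999999999555, 2.000000000000000444

print(value)
print(estimate)
print(value > cosy_hi)             # the true value lies above the enclosure
print(value - cosy_hi)             # gap of roughly 1.4e-11
```

The gap is about five orders of magnitude larger than one ULP at 2, so it cannot be attributed to rounding in the referee.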

5.2 COSY: TAN Crossing Discontinuity 

Test case (ASCII): tan([1.0, 2.0]) or tan([1.0, 1.0E+30]) 

COSY result: Print a warning and return [-1.0E+35, 1.0E+35], which violates 
containment at points very close to π/2. This is a problem if the user ignores 
the warning, or if the warning scrolls off the screen. 

Cause: COSY correctly recognized that the challenge argument includes a sin- 
gularity, but it returned finite bounds. 

Solution: COSY authors modified COSY so that after the warning is printed, 
execution halts. 

5.3 COSY: ASIN or ACOS at ±1 

Test case (ASCII): asin(1), asin([-1.0, 1.0]), or similarly for acos. 

COSY result: Messages “asin(1) does not exist,” and “asin([-1, 1]) does not 
exist,” respectively. These assert mathematical falsehoods. 

Cause: COSY’s interval constructor INTV() outwardly rounds the intervals [1, 
1] and [-1, 1], even though their endpoints are exactly representable. Hence, 
COSY correctly detects that the challenge argument includes points outside 
the domain of asin. The default output routines in the test environment round 
the endpoints printed in the message. Other environments printed more 
digits, so there the message was correct as printed. 

Solution: COSY authors changed the formatting of the message to read, “arcsin 
does not exist for the interval [0.999999999999999,1.000000000000001].” 

5.4 Sun F95: tanh (Negative) 

To validate the testing methodology, we re-wrote the same test battery for Sun’s 
F95 compiler. For challenge arguments less than about -4, e.g., tanh ([-4.879, 
-4.267]), containment fails by 1-2 ULP’s. 

Cause: There was a discrepancy between production and development versions. 

Solution: Sun corrected the problem within one week, releasing an update. 

5.5 INTLAB: sqrt 

To further validate the testing methodology, we re-wrote the same test battery 
in Matlab for Rump’s INTLAB. For the sqrt function, every degenerate interval 
fails by one ULP, and most thick intervals fail. 

Cause: MATLAB’s sqrt is not the IEEE sqrt. It uses round to nearest, rather 
than the current rounding mode. 

Solution: Within a day, Rump posted a corrected version of INTLAB using its 
own rounding control for sqrt. 
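The effect of a round-to-nearest sqrt on a degenerate interval can be reproduced in a few lines of Python. This sketch does not use MATLAB or INTLAB. The function names are ours, and the one-ULP widening shown is a simple safe alternative, not the rounding control Rump implemented. It requires Python 3.9 or later for math.nextafter.

```python
import math
from fractions import Fraction


def sqrt_nearest(x):
    """Use the round-to-nearest sqrt for both bounds (not a valid enclosure)."""
    s = math.sqrt(x)
    return (s, s)


def sqrt_outward(x):
    """Widen the round-to-nearest result by one ULP on each side."""
    s = math.sqrt(x)
    return (math.nextafter(s, 0.0), math.nextafter(s, math.inf))


def encloses_sqrt(bounds, x):
    """Check lo <= sqrt(x) <= hi exactly by comparing squares (lo, hi >= 0)."""
    lo, hi = bounds
    return Fraction(lo) ** 2 <= Fraction(x) <= Fraction(hi) ** 2


print(encloses_sqrt(sqrt_nearest(2.0), 2.0))   # False: the bound misses sqrt(2)
print(encloses_sqrt(sqrt_outward(2.0), 2.0))   # True
```

Since sqrt(2) is irrational, no single double can serve as both bounds of an enclosure; directed rounding, or a safe widening as above, is required.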

6 Domains: Opportunity for Improvement? 

When a package for interval arithmetic encounters arguments outside the math- 
ematical domain, it can respond by 

1. Continue execution with empty, NAN, over/underflow, or other special value 

2. Consider f([x]) as f([x] ∩ domain of f) (Sun’s approach) 

3. Halt execution, possibly with an error message (COSY and INTLAB) 

As originally tested, COSY was not consistent in its handling of arguments 
outside the mathematical domain. Those inconsistencies have been corrected by 
the COSY authors. 

COSY considers it a fatal error to evaluate outside the domain of an 
expression, e.g., asin(1) or sqrt(0). These examples are outside the domain because 
COSY enlarges the intervals on construction. Sun’s F95 “handled” many cases 
COSY did not. For example, Sun considers sqrt ([-1, 1]) to be [0, 1]. 
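The difference between halting and intersecting with the domain can be sketched for sqrt as follows. This Python fragment is a toy model under our own naming, not the COSY or Sun implementation; bounds are widened by one ULP as a simple stand-in for directed rounding (Python 3.9 or later for math.nextafter).

```python
import math


def _sqrt_outward(lo, hi):
    """Enclose sqrt on [lo, hi] with 0 <= lo <= hi, widening by one ULP."""
    lower = max(0.0, math.nextafter(math.sqrt(lo), -math.inf))
    upper = math.nextafter(math.sqrt(hi), math.inf)
    return (lower, upper)


def sqrt_halt(lo, hi):
    """Halting strategy: refuse any argument that leaves the domain of sqrt."""
    if lo < 0.0:
        raise ValueError(f"sqrt is not defined on all of [{lo}, {hi}]")
    return _sqrt_outward(lo, hi)


def sqrt_intersect(lo, hi):
    """Intersection strategy: evaluate on [lo, hi] intersected with [0, inf)."""
    if hi < 0.0:
        return None  # empty intersection with the domain
    return _sqrt_outward(max(lo, 0.0), hi)


print(sqrt_intersect(-1.0, 1.0))   # lower bound 0.0, upper bound just above 1.0
try:
    sqrt_halt(-1.0, 1.0)
except ValueError as err:
    print("halted:", err)
```

The intersection strategy always returns something, which is convenient, but the caller no longer learns that part of the argument lay outside the domain.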

We suggest handling of domains as an opportunity for improvement. We 
found no further violations of containment, and we understand why COSY treats 
asin(1) or sqrt(0) as fatal errors. However, we would consider it an improvement 
if COSY were able to evaluate such cases correctly. 

Sun’s csets (containment sets) represent Sun’s effort to handle domains. Csets 
are based on an elegant theory, but their implications are not well understood by 
the interval community. For example, Neher has given an example f(x) = √x + 
1/2 = 0 on [-4, 4]. Naive cset evaluation gives f([-4, 4]) = [1/2, 5/2] ⊂ [-4, 4], 
incorrectly suggesting the existence of a fixed point. Cset evaluation appears to 
require independent verification of continuity, which is done implicitly in some 
systems for interval arithmetic. 

7 Tightness: Opportunity for Improvement? 

COSY makes many compromises for efficiency over tightness of the intervals. 
For example, the COSY interval constructor INTV() rounds endpoints outward, 
while Sun’s F95 and Rump’s INTLAB provide interval constructors that accept 
strings and round outward only when necessary to guarantee containment. 

We compared the excess widths of the COSY, Sun F95, and INTLAB results 
across our test cases. Table 2 shows the number of Units in the Last Place 
(ULP’s) the interval result is wider than the Maple result, the interval computed 
by Maple in 900 decimal digit arithmetic. Compared with IEEE double precision 
computed by COSY, the Maple result is a very good approximation to the true 
result. We do not have exactly the correct number of ULP’s in every case, but 
we do have a reliable measure of excess widths. Suppose (in pseudocode) 

t_U = Maple upper bound of the result 
c_U = upper bound computed by COSY (t_U <= c_U) 
r_U = c_U - t_U 
if t_U = 0 then r_U = r_U * 2^1022 else r_U = r_U / |t_U| * 2^52 
Similarly for the ULP’s at the lower bound r_L 
Add r_L + r_U ULP’s at lower and upper bounds 

For example, consider [1,2] + [3,4] = [4,6]. The COSY result is 

[FC FF FF FF FF FF 0F 40 08, 04 00 00 00 00 00 18 40 08] (hex) 

= [3.999 999 999 999 998 223 ..., 6.000 000 000 000 003 552 ...] , 

which is eight excess ULP’s because the constructors INTV(1.0, 2.0) and INTV 
(3.0, 4.0) round out, and the operator ADD rounds out further. Sun’s F95 and 
INTLAB give excess widths of zero ULP’s for this example. The excess widths 
in ULP’s can be large when the true answer is near the underflow limit. 
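The count of eight ULP's in this example can be reproduced directly from the hex dump. The Python sketch below assumes that the first eight bytes of each endpoint are a little-endian IEEE double and that the trailing 08 byte is not part of the number; the helper names are ours. It counts ULP's by stepping through adjacent doubles, which is a different (exact) measure from the approximate formula in the pseudo-code above.

```python
import struct


def double_from_le_hex(hex_bytes: str) -> float:
    """Decode 8 little-endian bytes, written as hex pairs, into a double."""
    return struct.unpack("<d", bytes.fromhex(hex_bytes))[0]


def ulps_between(x: float, y: float) -> int:
    """Count the doubles stepped over between x and y (finite, same sign)."""
    ix = struct.unpack("<q", struct.pack("<d", x))[0]
    iy = struct.unpack("<q", struct.pack("<d", y))[0]
    return abs(ix - iy)


lower = double_from_le_hex("FC FF FF FF FF FF 0F 40")
upper = double_from_le_hex("04 00 00 00 00 00 18 40")

print(lower, upper)
excess = ulps_between(lower, 4.0) + ulps_between(upper, 6.0)
print(excess)  # 4 ULP's below 4.0 plus 4 ULP's above 6.0
```

For positive doubles the integer bit patterns are ordered like the numbers themselves, so their difference is the number of representable steps between two values.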

Table 2 shows the number of test cases for which the interval result had the 
excess widths shown. Smaller excess widths are better, so it is better to have 
more test cases with excess widths of 0 - 2 and fewer test cases with larger 
excess widths. The first row in Table 2 shows that COSY computed the tightest 
possible enclosure (zero excess width) in 33 test cases, while F95 and INTLAB 
were as tight as possible in 1277 and 1201 test cases, respectively, from the total 
of 2,600 test cases. Test cases with no finite true result, with true result zero, 
or with underflow or overflow are excluded, leading to different numbers of total 
test cases reported for each package. 

Table 2. Excess width in ULP’s 

Excess ULP’s    COSY        COSY       Sun F95    INTLAB 
                June ’02    May ’03               ver. 4 
0               33          33         1277       1201 
1               1           1          697        607 
2               81          79         251        292 
3-4             746         746        26         147 
5-8             906         906        1          9 
9-16            194         190        0          2 
17-32           151         129        0          0 
33-64           17          15         0          0 
65-128          6           6          0          0 
129-256         14          14         0          0 
257-512         12          12         0          0 
Total           2161        2130       2252       2259 

Loss of tightness is not an error, but it is an opportunity for improvement, 
possibly at the expense of speed or portability. The Sun and INTLAB results in 
Table 2 show that increased tightness is achievable. 

8 Speed: Opportunity for Improvement? 

We prefer fast programs to slow ones, but unbiased, comprehensive speed testing 
is difficult and controversial. Speed is not in the scope of our tests, but we have 
run programs implementing the same test cases in different environments, and 
we suspect some readers might wonder, “How long did each take?” We make no 
claim of fair testing of speed. That could be the subject of another paper, but 
we report what we observed. 

COSY and INTLAB timings were made on a Toshiba Satellite 4090XDVD 
with an Intel Celeron at 400 MHz, 128 MB RAM, running Windows 98, denoted 
by (Win 98) in Table 3. The versions of COSY and INTLAB we tested both run 
in an interpreted mode. The Sun F95 timings were made on a Sun Enterprise 
250, UltraSPARC 3, 1 CPU at 450 MHz with 512 MB RAM, denoted by (SPARC) 
in Table 3. The F95 code was compiled, linked, and run. We have not reported 
compile and link times. 

Table 3 reports CPU time for one million evaluations of the Shekel 5 function, 
commonly used to measure a Standard Time Unit (STU) [5]: 

$$f(x) = -\sum_{i=1}^{m=5} \frac{1}{(x - A_i)(x - A_i)^T + c_i},$$ 

where A_i denotes the ith row of a given 5x5 matrix A, and c is a given vector 
of length 5. Evaluation of the Shekel 5 function reflects arithmetic operations, 
so we also report CPU time for the evaluation of 

$$f(x) = \log_{10}\left(\mathrm{asin}\left(\sin^2(x) + \cos^2(x) - \exp(\mathrm{atan}(-x^2))\right)\right) \qquad (1)$$ 

constructed to reflect executions for intrinsic function evaluations. 
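For readers who wish to experiment, Equation (1), f(x) = log10(asin(sin^2(x) + cos^2(x) - exp(atan(-x^2)))), can be written in double precision as in the Python sketch below. It is not the COSY, INTLAB, or F95 code that was timed, and the function name equation_1 is ours.

```python
import math


def equation_1(x: float) -> float:
    """Evaluate log10(asin(sin^2 x + cos^2 x - exp(atan(-x^2)))) in doubles."""
    inner = math.sin(x) ** 2 + math.cos(x) ** 2 - math.exp(math.atan(-x * x))
    return math.log10(math.asin(inner))


print(equation_1(1.0))
print(equation_1(2.5))

# At x = 0 the argument of log10 is zero, so the expression is undefined there.
try:
    equation_1(0.0)
except ValueError as err:
    print("x = 0 is outside the domain:", err)
```

The expression chains six intrinsic functions per evaluation, which is why its timing reflects the cost of the intrinsics and not of the basic arithmetic operations.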



Table 3. CPU times in seconds 

                                   COSY       INTLAB     Sun F95 
                                   (Win98)    (Win98)    (SPARC) 
1 M evaluations of Shekel 5 
    Double precision               92         410        25.4 
    Interval                       157        23289      33.2 
1 M evaluations Equation (1) 
    Double precision               7.3        142        2.89 
    Interval                       25.4       41650      13.58 
2,600 interval test cases          6.0        19.1       0.3 



INTLAB interval times were estimated by timing 10,000 evaluations and 
multiplying by 100. Execution of our interval test cases is dominated by disk 
I/O. In this environment, interpreted COSY is significantly faster than inter- 
preted INTLAB, although recoding either one in a style more appropriate for 
its environment may yield significant improvements. We did not attempt to op- 
timize the performance, preferring to keep the code for the tests as similar as 
possible in each environment. For example, the INTLAB code uses loops rather 
than much faster vector operations. The ratios of interval / real times for COSY 
are comparable with those for Sun’s F95, and significantly smaller than those 
for INTLAB. Results in other environments may be markedly different. 

Regarding tightness and speed, Martin Berz responds to the results of our tests, 
“COSY is designed on the two premises of portability across platforms on 
the one hand, and use within the Taylor model framework on the other. The 
desired portability is achieved by building interval intrinsics based on F77 in- 
trinsics, with the necessary safety factors of around four ULP’s because of the 
inherent precision (or rather lack thereof) of the intrinsics. The use in the Tay- 
lor model framework entails that in practically relevant calculations, these slight 
overestimations usually do not matter since the Taylor model approach is used 
for large domain intervals where because of dependency, conventional validated 
methods usually have much larger overestimations in all but the simplest cases. 
Furthermore, since the vast majority of effort in the Taylor model arithmetic lies 
in the floating point coefficient arithmetic which is highly optimized in COSY, 
the efficiency of the interval implementation is of secondary significance.” 








We repeated our tests replacing the default safety factor in COSY for inflation 
of F77 intrinsics by an inflation of one ULP at each end. We observed reduced 
excess widths and no further violations of containment. 



9 Testing COSY’s Taylor Model Arithmetic 

After testing COSY’s interval arithmetic, we turned to its Taylor model arith- 
metic. Revol et al. [12] provide mathematical proofs that the algorithms in COSY 
for multiplying a Taylor model by a scalar and for adding or multiplying two 
Taylor models return Taylor models satisfying the containment property. We 
performed broader, execution-based testing. Revol’s proof of the algorithm and 
our execution-based testing are complementary. The proof is more general than 
a (large) collection of test cases in the sense that test cases can demonstrate the 
existence of an error, but cannot demonstrate absence of errors. Our execution- 
based tests might discover implementation errors of a correct algorithm, and we 
covered operations and intrinsic functions Revol did not consider. 

Given an interval vector [x] and an expression f(x), a Taylor model TMf is 

1. p(x), a polynomial in x with floating-point coefficients, and 

2. [I], an interval 

such that f(x) ∈ TMf(x) = p(x) + [I] for all x ∈ [x]. The goal of our 
execution-based testing was to find examples for which containment of point 
evaluation failed, i.e., x ∈ [x] for which f(x) is not in TMf(x). We did not 
consider the weaker range bound test: f([x]) ⊆ TMf([x]). By inclusion 
monotonicity, if f(x) ∈ TMf(x) for all x ∈ [x], then f([x]) ⊆ TMf([x]). The point 
evaluation challenges might discover an error which could be masked by even 
slight interval overestimation in the interval evaluation challenge. 



9.1 Verification Process 

COSY’s Taylor model arithmetic can be verified using COSY’s interval 
arithmetic. All the comparison is done 
inside COSY. Alternatively, we can use Maple as a referee. Both of the tests are 
rigorous. The second test might detect containment failures the first one does 
not, but it is difficult to communicate the required information to Maple. We 
would have to communicate sparse structure of the Taylor model and binary 
values of its coefficients. The first test is much faster, and it is the approach we 
used. 

Taylor Model Verification:

1. Evaluate the function f over the domain [x].

   For example: f = cos(-3.14 + 1.57 * x) on [x] = [-1, 1].

2. Construct the Taylor model expression of f (TM_EXPR) in COSY.

   TM_EXPR := COS(-3.14 * TM_ONE + (1.57 * TM_ONE) * TM_INDEP);

   TM_ONE is a Taylor model for the constant ONE. It is used to convert
   constants such as -3.14 and 1.57 into Taylor models.

   TM_INDEP is a Taylor model for the independent variable.

3. Construct the interval expression of f (IVL_EXPR) in COSY.

   IVL_EXPR := COS(INTV(-3.14, -3.14) + INTV(1.57, 1.57) * VAR1);

   VAR1 is the interval independent variable.

4. Choose a point $z \in [x]$ and convert it to a tight interval [z] using COSY’s
   interval constructor.

5. Evaluate the polynomial part of the Taylor model expression (TM_EXPR)
   on the tight interval [z] and add the remainder bound.

6. Evaluate the interval expression (IVL_EXPR) on the tight interval [z].

7. Compare the results of steps 5 and 6.

   If the intervals are disjoint, there is an error.
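
The seven steps can be mimicked outside COSY. The following Python sketch is only an illustration under stated assumptions: the Taylor model of cos(a + b*x) is built by hand from the known derivatives of cos, the remainder bound is the Lagrange bound for |x| <= 1, and ordinary floating point with a small safety pad stands in for COSY's outward-rounded interval arithmetic. All function names are hypothetical and none of them belongs to COSY.

```python
import math

PAD = 1e-12  # crude stand-in for outward rounding (assumption, not rigorous)

def taylor_model_cos(a, b, order):
    """Taylor polynomial of cos(a + b*x) about x = 0 and a remainder bound
    valid for |x| <= 1 (Lagrange form, every derivative of cos is <= 1)."""
    coeffs = [b**k * math.cos(a + k * math.pi / 2) / math.factorial(k)
              for k in range(order + 1)]
    remainder = abs(b) ** (order + 1) / math.factorial(order + 1)
    return coeffs, remainder

def tm_enclosure(coeffs, remainder, z):
    """Step 5: polynomial part at z plus the remainder bound."""
    p = sum(c * z**k for k, c in enumerate(coeffs))
    return (p - remainder - PAD, p + remainder + PAD)

def ivl_enclosure(a, b, z):
    """Step 6: 'interval' evaluation of cos(a + b*z) at the tight point z."""
    v = math.cos(a + b * z)
    return (v - PAD, v + PAD)

def disjoint(u, v):
    """Step 7: disjoint enclosures signal a containment error."""
    return u[1] < v[0] or v[1] < u[0]

def check(a, b, order, points):
    """Return the challenge points at which the two enclosures are disjoint."""
    coeffs, rem = taylor_model_cos(a, b, order)
    return [z for z in points
            if disjoint(tm_enclosure(coeffs, rem, z), ivl_enclosure(a, b, z))]

failures = check(-3.14, 1.57, 5, [-1.0, -0.5, 0.0, 0.5, 1.0])
```

Setting the remainder bound to zero in `tm_enclosure` mimics a faulty Taylor model; the comparison at z = 1 then reports disjoint intervals, which is the kind of failure the procedure is designed to expose.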

9.2 Testing Scope 

We designed test cases to evaluate the COSY operations +, -, ×, sin, cos,
tan, asin, acos, atan, sinh, cosh, tanh, log, exp, sqrt, sqr, isqrt, and unary + and
-. Taylor model operations combine their operand polynomials and interval
remainder bounds using floating point arithmetic to the extent possible while
guaranteeing that the resulting Taylor model preserves containment. We tested
Taylor models with both general domains for the independent variables and
domains normalized to $[-1, 1]^n$ at dimension 1 (13 expressions): order 1, ..., 20;
dimension 2 (20 expressions): order 1, ..., 18; and dimension 7 (21 expressions):
order 1, 2, 3, and 4. “Dimension” denotes the number of independent variables,
and “order” is the order of the Taylor model polynomial. The Taylor models were
challenged at the corner points of n-dimensional boxes and at a few interior
points. As for the interval tests, we expect errors to be most visible at the
boundaries. Here is pseudo-code for these tests:

Loop for general and normalized domain
    Dimension = 1; Loop for order = 1, ..., 20
        Loop for 9 challenge points
            Loop for Taylor model 1 ... 13
                Pass to 149 unary operations
                Pass to 69 binary operations
    Dimension = 2; Loop for order = 1, ..., 18
        Loop for 25 challenge points
            Loop for Taylor model 1 ... 20
                Pass to 149 unary operations
                Pass to 69 binary operations
    Dimension = 7; Loop for order = 1, 2, 3, 4
        Loop for 256 challenge points
            Loop for Taylor model 1 ... 21
                Pass to 149 unary operations
                Pass to 69 binary operations

This represents more than 300,000 Taylor models challenged at a total of over
14 million points. That test suite required about eight hours on the 400 MHz
Intel Celeron machine described in Sect. 8. In constructing test cases, we
considered order, dimension, normalization, domain, challenge points in the
domain, sparsity, oscillation, simplicity, and special numbers to create at least
one test case from each test case equivalence class. We adopted the same
philosophy as in the interval tests: the test case is the internal binary form of
the expression constructed from approximate ASCII representations.
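
Challenge points at the corners of an n-dimensional box, plus an interior point, can be enumerated as in the following Python sketch. It is an assumption-laden illustration: the helper names are hypothetical, and the suite's own point sets (9, 25 and 256 points) were chosen by the authors and are not reproduced here.

```python
from itertools import product

def corner_points(box):
    """All 2**n corners of an n-dimensional box given as [(lo, hi), ...]."""
    return [tuple(p) for p in product(*box)]

def midpoint(box):
    """Centre of the box, used here as the single interior challenge point."""
    return tuple(0.5 * (lo + hi) for lo, hi in box)

def challenge_points(box):
    """Corners first, then the midpoint."""
    return corner_points(box) + [midpoint(box)]

normalized_2d = [(-1.0, 1.0)] * 2
points = challenge_points(normalized_2d)
```

For the normalized two-dimensional domain this yields the four corners and the origin; a seven-dimensional box has 128 corners.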

A second test suite used 11 expressions such as

1. cos(-3.141592653590006 + 1.570796326794687 x_1);

2. sin(-4.712388980384691 + 1.570796326794690 x_1);

3. asin(0.0009999999999999983 x_1);

4. asin(-0.4935 + 0.003499999999999997 x_1);

5. asin(0.0004999999999999989 x_1 + 0.0004999999999999989 x_2 x_5);

Loop for general and normalized domain
    Dimension = 1; Loop for order = 1, 7, 15, 17, 20
        Loop for 8 challenge points
            Loop for expression 1 ... 5
    Dimension = 2; Loop for order = 1, 7, 15, 17, 18
        Loop for 25 challenge points
            Loop for expression 1 ... 9
    Dimension = 7; Loop for order = 1, 2, 3, 4
        Loop for 256 challenge points
            Loop for expression 1 ... 11

This represents 228 Taylor models challenged at more than 25,000 points. 
This test required about 90 seconds and disclosed violations of containment in 
sin and cos and in asin and acos. 
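
The quoted totals can be checked against the loop structure of the two pseudo-code listings. The short Python sketch below does the arithmetic; it assumes (this is an interpretation, not a statement of the authors) that in the first suite each of the 149 unary and 69 binary operations applied to an input model counts as one Taylor model.

```python
UNARY_OPS, BINARY_OPS = 149, 69

# (number of orders, challenge points, expressions) for dimensions 1, 2 and 7
suite_1 = [(20, 9, 13), (18, 25, 20), (4, 256, 21)]
suite_2 = [(5, 8, 5), (5, 25, 9), (4, 256, 11)]

def count(suite, domains=2, ops=1):
    """Number of Taylor models and of point challenges for one suite.
    domains = 2 covers the general and the normalized domain."""
    models = domains * ops * sum(o * e for o, _, e in suite)
    points = domains * ops * sum(o * c * e for o, c, e in suite)
    return models, points

models_1, points_1 = count(suite_1, ops=UNARY_OPS + BINARY_OPS)
models_2, points_2 = count(suite_2)
print(models_1, points_1)
print(models_2, points_2)
```

Under that counting convention the first suite gives a little over 300,000 models and 14 million points, and the second gives exactly 228 models and slightly more than 25,000 points, in agreement with the figures stated in the text.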

9.3 Containment Error in sin and cos 

We found a containment violation in sin and cos (examples 1 and 2
above) for arguments of dimensions 1 and 2 with order 17 at $x_1$ near -1.

Cause: In the test environment, integer arithmetic used internally by COSY
overflows and wraps from positive to negative with no alert, warning, or trap.

Solution: Replace some integer arithmetic in the sin and cos modules by double
precision. The remaining COSY code was carefully scanned to be sure there were
no similar uses of integer arithmetic. The May 2, 2003, version of COSY runs the
test cases as expected.
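
The mechanism of silent wraparound can be illustrated as follows. This Python sketch is hypothetical: the text does not say which integer quantity overflowed inside COSY, so a factorial is used purely as a stand-in, and a signed 32-bit integer width is assumed.

```python
def wrap_int32(v):
    """Reduce v to the range of a signed 32-bit integer (two's complement)."""
    return (v + 2**31) % 2**32 - 2**31

def factorial_int32(n):
    """n! computed with silent 32-bit wraparound after every multiplication."""
    acc = 1
    for k in range(2, n + 1):
        acc = wrap_int32(acc * k)
    return acc

def factorial_double(n):
    """n! accumulated in double precision, as in the described remedy."""
    acc = 1.0
    for k in range(2, n + 1):
        acc *= k
    return acc

for n in (12, 13, 17):
    print(n, factorial_int32(n), factorial_double(n))
```

In this stand-in, 12! still fits, 13! is already wrong although it remains positive, and by 17! the wrapped value has become negative, all without any warning. Accumulating in double precision avoids the wraparound for these sizes.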

9.4 Containment Error in asin and acos 

We found several containment violations in asin (examples 3-5 above).

Cause: In one case in the asin module, some coefficients were multiplied by $[0, h]$
instead of $[-h, h]$.

Solution: Correct the coding error. The May 2, 2003, version of COSY runs the
test cases as expected.
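
Why a one-sided factor breaks containment can be seen on a single term. The following Python lines are a minimal sketch with made-up numbers (the coefficient 2.0 and h = 0.5 are assumptions chosen for illustration, and no rounding control is applied).

```python
def scale(c, interval):
    """Product of the scalar c with an interval (lo, hi)."""
    lo, hi = c * interval[0], c * interval[1]
    return (min(lo, hi), max(lo, hi))

def contains(interval, v):
    return interval[0] <= v <= interval[1]

h, c = 0.5, 2.0
wrong = scale(c, (0.0, h))   # one-sided factor [0, h]
right = scale(c, (-h, h))    # symmetric factor [-h, h]
sample = c * (-0.25)         # value of the term for an offset of -0.25
```

The value taken by the term for a negative offset lies in the symmetric enclosure but outside the one-sided one, so any bound built from the one-sided product can miss true function values.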

10 Conclusions and Extensions

Testing software of this complexity is itself a complex task. One needs to develop
test cases that distinguish subtle errors. For interval packages, one must present
the software under test with test cases free from possible roundoff, and one
similarly must guard against roundoff in specifying the expected result.

Effective testing of interval and Taylor model arithmetic in COSY is difficult
because the conservative outward rounding of interval arithmetic can mask
subtle errors. Simple test cases were successful (found errors) where more
complicated tests had failed. For example, we found Taylor model errors in sin and
in asin, although extensive sin(asin(x)) and asin(sin(x)) tests had passed.
Similarly, asymmetric tests seemed to be more powerful. The error in sin and cos
appeared only for order 17 because the remainder has the form $[0, \delta]$ rather than
$[-\delta, \delta]$. The error is present in other orders, but it is hidden by slight excess
widths introduced by repeated outward roundings.

Although our test suites for both interval and Taylor model arithmetics are
large, they are neither comprehensive nor exhaustive. For example, one might
port Gonnet’s floating point tests from

www.inf.ethz.ch/personal/gonnet/FPAccuracy/Analysis.html. Gonnet’s is
a demanding test for the accuracy of double precision intrinsic functions. He
uses challenge points known to be problematic or for which evaluation values
are known to be problematic. Gonnet’s additional values may disclose errors in
interval or Taylor model evaluation.

Execution-based testing cannot show the absence of errors, but can only
demonstrate their presence. While we prefer to see no errors in our programs, 
especially in programs that claim to compute with guarantees, we think it speaks 
well of the authors of the COSY, Sun F95, and INTLAB packages we tested that 
we found relatively few errors. We cannot guarantee that they are now error-free, 
but our tests should appreciably raise the level of confidence in their reliability. 

Complete software for the testing reported here is available from
www.eng.mu.edu/corlissg/Pubs/COSYtest.

We encourage users of COSY and most other software packages 
to check author/vendor web sites regularly for possible updates and 
patches. 

Abstract. This paper is about guaranteed nonlinear parameter and
state estimation. Sets are computed that contain all possible values of the
parameter (or state) vector given bounds on the acceptable errors. The
main requirement is that the dynamical equations describing the evolution
of the model can be bounded between cooperative models, i.e.,
models such that the off-diagonal entries of their Jacobian matrix remain
positive. The performance and limitations of the proposed techniques
are illustrated on a nonlinear compartmental model.



1 Introduction 

Parameter and state estimation problems are encountered when modeling a pro- 
cess that involves uncertain quantities to be estimated from measurements. 

Consider a system with known input vector $u(t)$ and output vector $y(t)$.
Assume it is described by a model with the same input and consisting of a
dynamical state equation

$$x'(t) = f(x(t), p, w(t), u(t)), \tag{1}$$

with initial condition

$$x(0) = x_0(p), \tag{2}$$

and an observation equation

$$y_m(x(t), p, t) = g(x(t), p) + v(t), \tag{3}$$

where the vector $x$ is the state of the model, $x'$ is its derivative with respect
to time, $p$ is a vector of unknown parameters and $w$ and $v$ are vectors of state
perturbations and measurement noise. State perturbations account for the fact
that (1) is only an approximation of reality. Measurement noise is introduced
in (3) to represent the imperfection of the sensors measuring the outputs of the
system.

R. Alt et al. (Eds.): Num. Software with Result Verification, LNCS 2991, pp. 107-123, 2004. 

Finding an estimate $\hat{p}$ for $p$ such that the output of the model $y_m(x(t), \hat{p}, t)$
is an acceptable approximation of the output of the system $y(t)$ is called
parameter estimation. Similarly, finding an estimate $\hat{x}(t)$ for $x(t)$ is called state
estimation. When the two problems are solved simultaneously, one speaks of
joint parameter and state estimation.

This paper focuses on bounded-error estimation. In this context, it is as- 
sumed that the perturbations and noise are bounded with known bounds and 
one looks for the set of all parameter (or state) vectors that are consistent with 
the experimental data and these bounds. Specific methods are available for the 
case where the model output is linear in the parameter vector (or the initial 
state vector), and we shall concentrate on the more difficult nonlinear case. 

The first part of this paper deals with recursive state estimation of a
continuous-time model assuming that the system output is measured at
discrete time instants. An idealized algorithm is presented first. Like the Kalman
filter, it alternates prediction and correction. The prediction step computes the
evolution of the set corresponding to the state estimate. The correction step
takes place as soon as a measurement of the system output becomes available.
It computes the intersection of the previously calculated set with the set of all
state vectors that are consistent with this measurement and the bounds on the
measurement error.

The second part of this paper deals with parameter estimation. It is much 
shorter since the same type of tools are used as in the correction step of state 
estimation. 

The specific difficulty when estimating the parameter or state vector of a
continuous-time state-space model is that most often there is no closed-form
solution of the differential state equation, which makes it hard to obtain
the inclusion function required for this solution. Guaranteed interval integration
could in principle be used, but it becomes notoriously pessimistic as soon as the
uncertainty in the parameters and initial conditions is large, as required here.
We shall see that a much less pessimistic numerical inclusion function for the
model output can be evaluated if the differential model can be enclosed between
two cooperative systems.

State and parameter estimations are illustrated with compartmental models, 
widely used in biology. 



2 Recursive State Estimation 

2.1 Introduction 

In this section, an estimate $\hat{x}(t)$ for $x(t)$ is to be obtained such that
$y_m(\hat{x}(t), p, t)$, the output of the model (1)-(3), is an acceptable approximation
of the output $y(t)$ of the system.

Note that the parameter vector $p$ is not necessarily known. Two approaches
may be considered for estimating $x(t)$ when $p$ is uncertain. A first method
assumes some prior knowledge about the evolution of $p$, described by the
differential equation

$$p'(t) = f_p(x(t), p(t), w_p(t), u(t)), \quad p(0) = p_0, \tag{4}$$

where $w_p$ plays the same role as $w$ for $x$. (If the parameters are assumed to be
constant, the differential equation in (4) boils down to $p'(t) = 0$.) Defining an
extended state vector

$$x^e(t) = (x^T(t), p^T(t))^T,$$

makes it possible to obtain from (1) and (4) an extended dynamical state equation

$$\begin{pmatrix} x'(t) \\ p'(t) \end{pmatrix} = \begin{pmatrix} f(x(t), p, w(t), u(t)) \\ f_p(x(t), p(t), w_p(t), u(t)) \end{pmatrix}, \quad \begin{pmatrix} x(0) \\ p(0) \end{pmatrix} = \begin{pmatrix} x_0(p_0) \\ p_0 \end{pmatrix},$$

or equivalently

$$(x^e(t))' = f^e(x^e(t), w(t), w_p(t), u(t)), \quad x^e(0) = x_0^e(p_0).$$

With this approach, which corresponds to joint parameter and state estima- 
tion, the distinction between state variables and parameters disappears, and the 
situation is formally equivalent to the case with no uncertain parameters. 

A second method, which is the one to be employed in this paper, integrates 
the uncertainty about p in the state perturbations and measurement noise. No 
attempt will then be made at estimating p, which will be considered as a nuisance 
parameter vector. 

When f and g in (1) and (3) are linear functions of the state vector and when 
moreover the perturbations and noise are additive and receive a probabilistic 
description by their means and covariances, Kalman filtering [15] is the standard 
approach to state estimation. In the context of bounded errors, many tools are 
also available, see, e.g., [2], [20] and [23].

In a nonlinear context, the methodology is far less developed. When uncer- 
tainty is explicitly taken into account, this is most often by using an extended 
Kalman filter [4] based on the linearization of (1) around the state trajectory. It 
is well known that this type of filter may fail to produce a useful estimate of the 
state vector, and that the characterization of the uncertainty in this estimate is 
not reliable. 

Guaranteed state bounding is an attractive alternative, which has been
considered in a discrete-time context in [11] and [18]. All state vectors consistent with
the data, model and bounds are enclosed in a subpaving, consisting of a union of
disconnected boxes. In a continuous-time context, a state estimator for models
such as that described by (1) and (3) was proposed in [10] but with no state
perturbation or parameter uncertainty taken into account. Techniques bounding
the state of continuous-time systems with poorly known state equations and
inputs are presented in [1] and [6], with applications in waste processing. Provided
specific assumptions are satisfied by the signs of the entries of $\partial f / \partial x$, interval
observers can be built. An interval observer is a pair of classical point observers




110 



M. Kieffer and E. Walter 



computing a box enclosure of the state x at any given time based on lower and 
upper bounds for each of the uncertain variables. 

In this paper, interval observers and the recursive state estimation algorithm
presented in [12] and [18] are combined to enclose the state $x(t)$ of the model
(1)-(3) at any given instant of time $t$ in a subpaving. This is performed
recursively and can thus be implemented in real time. Preliminary results have been
presented in [19].

An idealized algorithm is first proposed in Section 2.2. An implementable 
counterpart of this algorithm is then described in Section 2.3. The advantages 
and limitations of the approach are illustrated on an example in Section 2.4. 

2.2 Idealized Algorithm 

Consider the model (1)-(3) and a set of sampling instants $T = \{t_i\}_{i \in \mathbb{N}^*}$, such
that $t_{i+1} > t_i$, at which the measurements $y(t_i)$ have been collected. Initially,
$x(0)$ is only known to belong to some box $[x_0]$. The vector $p$ of uncertain
parameters is assumed to be constant and to belong to some known prior box $[p_0]$.
The state perturbation $w(t)$ is assumed to satisfy $\underline{w}(t) \le w(t) \le \overline{w}(t)$ at any
$t \ge 0$, where $[w(t)] = [\underline{w}(t), \overline{w}(t)]$ is known for all $t$ and the inequalities are
to be understood componentwise. The measurement noise $v(t_i)$ is similarly
assumed to belong to $[v(t_i)] = [\underline{v}(t_i), \overline{v}(t_i)]$, known at each $t_i$. The information
$I(t)$ available at time $t \ge 0$ is given by

$$I(t) = \left\{ [x_0], [p_0], \{[w(\tau)], u(\tau)\}_{\tau \in [0, t]}, \{[v(t_i)]\}_{i=1}^{M} \right\}, \tag{5}$$

where $M$ is such that $t_M \le t < t_{M+1}$. In this context, causal state estimation
is the characterization of the set $X(t)$ of all values of the state $x(t)$ at any time
$t \ge 0$ that are consistent with $I(t)$.

As in the Kalman filter, the idealized recursive causal state estimator consists
of two steps.

For the prediction step, assume that $X(t_i)$ is some set guaranteed to contain
$x(t_i)$. For any given $x \in X(t_i)$, let $\psi(x, t, t_i, p, \{w(\tau), u(\tau)\}_{\tau \in [t_i, t]})$ be the
value at time $t$ of the flow associated with (1) that coincides with $x$ at time $t_i$.
Define the predicted set $X^+(t_{i+1})$ as

$$X^+(t_{i+1}) = \left\{ \psi(x, t_{i+1}, t_i, p, \{w(\tau), u(\tau)\}_{\tau \in [t_i, t_{i+1}]}) \mid p \in [p_0],\ w(\tau) \in [w(\tau)],\ x \in X(t_i),\ \tau \in [t_i, t_{i+1}] \right\}. \tag{6}$$

By construction, $x(t_{i+1}) \in X^+(t_{i+1})$.

Now, for the correction step, let $[y(t_{i+1})]$ be the box containing all possible
values of the noise-free output when the value of the measured output is $y(t_{i+1})$,

$$[y(t_{i+1})] = y(t_{i+1}) - [v(t_{i+1})], \tag{7}$$

and let $X^o(t_{i+1})$ be the set of all values of the state at time $t_{i+1}$ that could have
led to an observation $y$ in $[y(t_{i+1})]$,

$$X^o(t_{i+1}) = \{ x \in \mathbb{R}^n \mid g(x, p) \in [y(t_{i+1})],\ p \in [p_0] \}. \tag{8}$$

Then, the corrected set

$$X(t_{i+1}) = X^+(t_{i+1}) \cap X^o(t_{i+1}) \tag{9}$$

is also guaranteed to contain $x(t_{i+1})$ (see Figure 1).






Fig. 1. Idealized state estimation




This is summarized in the following idealized algorithm. 

Algorithm 1 

For i = 0 to N - 1, do {

1. Prediction: evaluate $X^+(t_{i+1})$;

2. Correction: $X(t_{i+1}) = X^+(t_{i+1}) \cap X^o(t_{i+1})$; }

It is easy to show [16] that $X(t)$ as evaluated by Algorithm 1 is the smallest
set guaranteed to contain $x(t)$ that can be computed from $I(t)$ and (1). The next
section presents the basic tools required to obtain an implementable counterpart
to Algorithm 1.
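
For a scalar state, the sets of Algorithm 1 reduce to intervals and both steps can be written in closed form. The Python sketch below is an illustration under assumptions that are not taken from the paper: the system is the hypothetical decay model x' = -a x with a in [0.8, 1.2], the measurement is y = x + v with |v| <= 0.05, the noise sequence is deterministic, and plain floating point is used without outward rounding.

```python
import math

def predict(x, a, h):
    """Image of the interval x = (lo, hi), lo >= 0, under x' = -a*x over a
    time step h when a is only known to lie in the interval a."""
    return (x[0] * math.exp(-a[1] * h), x[1] * math.exp(-a[0] * h))

def correct(x_pred, y, v_max):
    """Intersect the prediction with the states consistent with y = x + v."""
    lo, hi = max(x_pred[0], y - v_max), min(x_pred[1], y + v_max)
    if lo > hi:
        raise ValueError("data inconsistent with model and bounds")
    return (lo, hi)

a_box, v_max, h = (0.8, 1.2), 0.05, 0.5
x_true, x_set = 1.0, (0.0, 2.0)       # true state and prior box [x0]
history = []
for i in range(10):
    x_true *= math.exp(-1.0 * h)                    # true system, a = 1
    y = x_true + (0.03 if i % 2 == 0 else -0.03)    # bounded noise
    x_set = correct(predict(x_set, a_box, h), y, v_max)
    history.append((x_true, x_set))
```

Each entry of `history` pairs the true state with the corrected interval; the interval always contains the true state and, after the first correction, is never wider than the measurement uncertainty.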



2.3 Implementation Issues 

To obtain an implementable counterpart to Algorithm 1, three main problems 
have to be solved. 

The first one is to represent the sets $X(t_i)$, $X^o(t_i)$ and $X^+(t_i)$ in computer
memory. In this paper, the description of sets using subpavings presented in [17]
is used.

The second problem is the evaluation of $X^o(t_{i+1})$ during the correction step
(8). An outer approximation $\overline{X}^o(t_{i+1})$ of $X^o(t_{i+1})$ by a subpaving can be
obtained using the SIVIA algorithm (see below). The precision of this outer
approximation is controlled by a precision factor $\varepsilon_S$.

The remaining problem is the solution at the prediction step of the set of
IVPs required to evaluate $X^+(t_{i+1})$. Standard guaranteed tools are available to
solve IVPs such as $\{x' = f(x, t),\ x(0) = x_0\}$ or $\{x' = f(x, t),\ x(0) \in [x_0]\}$, see,
e.g., AWA ([21], [22]), COSY ([8], [9]) or VNODE ([24]). These techniques use
Brouwer’s fixed-point theorem to show the existence of a solution and build
a Taylor expansion of the solution while bounding the remainder. However,
they become very inefficient in the presence of unknown parameters or bounded
state perturbations because the bounds on the remainder soon become extremely
large. We shall present a more efficient approach, based on cooperativity.



SIVIA. Using interval analysis, it is possible to provide inner and outer
approximations of the set $X^o(t_{i+1})$ defined by (8), using the algorithm SIVIA (for
Set Inverter Via Interval Analysis, see [13] and [14]), briefly recalled here.

An initial bounded search set $X_0$ guaranteed to contain $X^o(t_{i+1})$ has to be
provided first. SIVIA partitions $X_0$ into three subpavings, namely $X_{in}$ contained
in $X^o(t_{i+1})$, $X_{out}$ such that its intersection with $X^o(t_{i+1})$ is empty, and $X_{bound}$
for which no conclusion could be reached.

Consider a box $[x] \subset X_0$ and let $[g](\cdot)$ be an inclusion function for $g(\cdot)$.

1. If $[g]([x], [p_0]) \subset [y(t_{i+1})]$, then for any $x \in [x]$ and $p \in [p_0]$, $g(x, p) \in
   [y(t_{i+1})]$ and $[x]$ is entirely included in $X^o(t_{i+1})$; it is thus stored in $X_{in}$.

2. If $[g]([x], [p_0]) \cap [y(t_{i+1})] = \emptyset$, then $g([x], [p_0]) \cap [y(t_{i+1})] = \emptyset$ and $[x]$,
   proved to have an empty intersection with $X^o(t_{i+1})$, can be stored in $X_{out}$.

3. If neither of the previous tests is satisfied, then $[x]$ is undetermined. If the
   width of such an undetermined box is larger than the precision factor $\varepsilon_S$,
   then it is bisected into two subboxes $[x_1]$ and $[x_2]$ to which the same tests
   are applied. Undetermined boxes that are too small to be bisected are stored
   in $X_{bound}$.

$X^o(t_{i+1})$ is thus bracketed (in the sense of inclusion) between $X_{in}$ and
$\overline{X}^o(t_{i+1}) = X_{in} \cup X_{bound}$. The volume of the uncertainty subpaving $X_{bound}$
may be reduced, at the cost of increasing computational effort. Note that there
is actually no need to store $X_{out}$.
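
The three tests translate into a short loop over a stack of boxes. The Python sketch below is a minimal illustration, not the authors' implementation: the function g(x) = x1^2 + x2^2 and the target interval [1, 2] are assumptions chosen so that the solution set (a ring) is easy to picture, there is no parameter box, and ordinary floating point replaces outward-rounded interval arithmetic.

```python
def sqr(iv):
    """Interval square of iv = (lo, hi)."""
    lo, hi = iv
    if lo <= 0.0 <= hi:
        return (0.0, max(lo * lo, hi * hi))
    return (min(lo * lo, hi * hi), max(lo * lo, hi * hi))

def ring_incl(box):
    """Inclusion function of g(x) = x1**2 + x2**2 over a 2-D box."""
    (a, b), (c, d) = sqr(box[0]), sqr(box[1])
    return (a + c, b + d)

def sivia(g_incl, y, box, eps):
    """Return (inner, boundary) subpavings of {x in box | g(x) in y}."""
    inner, boundary, stack = [], [], [box]
    while stack:
        b = stack.pop()
        lo, hi = g_incl(b)
        if y[0] <= lo and hi <= y[1]:
            inner.append(b)                      # test 1: box is inside
        elif hi < y[0] or lo > y[1]:
            continue                             # test 2: box is outside
        else:
            widths = [h - l for l, h in b]
            k = max(range(len(b)), key=widths.__getitem__)
            if widths[k] <= eps:
                boundary.append(b)               # test 3: too small to bisect
            else:
                l, h = b[k]
                mid = 0.5 * (l + h)
                stack.append(b[:k] + ((l, mid),) + b[k + 1:])
                stack.append(b[:k] + ((mid, h),) + b[k + 1:])
    return inner, boundary

def covered(boxes, point):
    return any(all(lo <= v <= hi for (lo, hi), v in zip(b, point))
               for b in boxes)

inner, boundary = sivia(ring_incl, (1.0, 2.0), ((-2.0, 2.0), (-2.0, 2.0)), 0.1)
```

The union of `inner` and `boundary` plays the role of the outer approximation; boxes rejected by test 2 are simply dropped, in line with the remark that the outer subpaving need not be stored.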



Inclusion functions based on cooperativity. This section aims at defining
an implementable procedure for computing $X^+(t_{i+1})$ defined by (6) based on
the concept of cooperativity [26].

Definition 1. A dynamical system

$$x' = f(x, t)$$

is cooperative over a domain $\mathcal{D}$ if

$$\frac{\partial f_i}{\partial x_j}(x, t) \ge 0, \quad \text{for all } i \ne j,\ t \ge 0 \text{ and } x \in \mathcal{D},$$

i.e., if all off-diagonal entries of the Jacobian matrix of $f$ are non-negative for
all $t \ge 0$ and $x \in \mathcal{D}$.
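
The sign condition of Definition 1 can be probed numerically. The Python sketch below samples a finite-difference Jacobian at a few points; it is an assumption-based illustration with hypothetical example systems, and sampling can only refute cooperativity, never prove it (a rigorous check would bound the Jacobian over the whole domain, e.g., with interval arithmetic).

```python
def jacobian_fd(f, x, t, step=1e-6):
    """Central finite-difference Jacobian of f(x, t) with respect to x."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += step
        xm[j] -= step
        fp, fm = f(xp, t), f(xm, t)
        cols.append([(fp[i] - fm[i]) / (2 * step) for i in range(n)])
    # cols[j][i] approximates d f_i / d x_j; return J[i][j]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def looks_cooperative(f, samples, t=0.0, tol=1e-8):
    """False as soon as one sampled off-diagonal Jacobian entry is negative."""
    for x in samples:
        J = jacobian_fd(f, x, t)
        for i in range(len(x)):
            for j in range(len(x)):
                if i != j and J[i][j] < -tol:
                    return False
    return True

def exchange(x, t):
    """Hypothetical linear exchange between two tanks."""
    return [-0.5 * x[0] + 0.2 * x[1], 0.5 * x[0] - 0.2 * x[1]]

def predator_prey(x, t):
    """Counter-example: d f_1 / d x_2 = -x_1 is negative for x_1 > 0."""
    return [x[0] - x[0] * x[1], -x[1] + x[0] * x[1]]

grid = [(0.5 * a, 0.5 * b) for a in range(1, 5) for b in range(1, 5)]
```

On the sampled grid the exchange model passes the test while the predator-prey model fails it.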

The following theorem, which is a reformulation of a result in [26], will be 
used to obtain an enclosure for x (t) in (1). This enclosure will be instrumental 
in the implementation of the prediction step. 

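Theorem 1. If there exists a pair of cooperative systems

$$x' = \underline{f}(x, \underline{p}, \overline{p}, t) \quad \text{and} \quad x' = \overline{f}(x, \underline{p}, \overline{p}, t) \tag{10}$$

satisfying

$$\underline{x}_0 \le x(0) \le \overline{x}_0$$

and

$$\underline{f}(x, \underline{p}, \overline{p}, t) \le f(x, p, w, u) \le \overline{f}(x, \underline{p}, \overline{p}, t),$$

for all $p \in [\underline{p}, \overline{p}]$, $w(t) \in [\underline{w}(t), \overline{w}(t)]$, $t \ge 0$ and $x \in \mathcal{D}$, then the state of (1)
satisfies

$$\underline{x}(t) \le x(t) \le \overline{x}(t), \quad \text{for all } t \ge 0,$$

where $\underline{x}(t) = \underline{\phi}(\underline{x}_0, \underline{p}, \overline{p}, t)$ is the flow associated with
$\{x' = \underline{f}(x, \underline{p}, \overline{p}, t),\ x(0) = \underline{x}_0\}$ and $\overline{x}(t) = \overline{\phi}(\overline{x}_0, \underline{p}, \overline{p}, t)$ is the flow
associated with $\{x' = \overline{f}(x, \underline{p}, \overline{p}, t),\ x(0) = \overline{x}_0\}$.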

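For any $t \ge 0$, the box-valued function

$$[\phi](\underline{x}_0, \overline{x}_0, \underline{p}, \overline{p}, t) = \left[\underline{\phi}(\underline{x}_0, \underline{p}, \overline{p}, t),\ \overline{\phi}(\overline{x}_0, \underline{p}, \overline{p}, t)\right]$$

is thus an inclusion function for $x(t)$, the solution of (1). However, this function
is difficult to evaluate, as usually no explicit expressions are available for $\underline{\phi}(\cdot)$
and $\overline{\phi}(\cdot)$. Interval analysis provides tools for computing guaranteed outer
approximations of the solution of initial value problems, see, e.g., [24]. Using these
techniques, it becomes possible to compute tight enclosures of $\underline{\phi}(\underline{x}_0, \underline{p}, \overline{p}, t)$ and
$\overline{\phi}(\overline{x}_0, \underline{p}, \overline{p}, t)$. The box-valued function formed by the lower bound of the
enclosure of $\underline{\phi}$ and the upper bound of the enclosure of $\overline{\phi}$,

$$[[\phi]]([x_0], [p], t) = \left[\underline{[\underline{\phi}]}(\underline{x}_0, \underline{p}, \overline{p}, t),\ \overline{[\overline{\phi}]}(\overline{x}_0, \underline{p}, \overline{p}, t)\right],$$

is thus such that

$$\psi(x_0, t, 0, p, \{w(\tau), u(\tau)\}_{\tau \in [0, t]}) \in [[\phi]]([x_0], [p], t),$$

for $x_0 \in [x_0]$, $p \in [\underline{p}, \overline{p}]$, $w(t) \in [\underline{w}(t), \overline{w}(t)]$, $t \ge 0$, and is therefore an inclusion
function for the solution $x(t)$ of (1), which can be numerically evaluated for any
$t > 0$.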

Let $\Phi([x_0], [p_0], t)$ be the set of all $x(t)$ that can be traced back to an initial
condition in $[x_0]$ according to (1) with a parameter vector $p \in [p_0]$. Then, if the
conditions of Theorem 1 are satisfied,

$$x(t) \in \Phi([x_0], [p_0], t) \subset [[\phi]]([x_0], [p_0], t) \quad \text{for any } t > 0.$$

Interval observers using $[[\phi]]([x_0], [p_0], t)$ are only able to provide a box
containing $\Phi([x_0], [p_0], t)$. However, $\Phi([x_0], [p_0], t)$ is usually not a box, see Figure 2.
Here, we propose to improve the accuracy of the approximation of $\Phi([x_0], [p_0], t)$
by enclosing it in a subpaving using the ImageSp algorithm presented in [17]
and [18] and briefly recalled now.



Fig. 2. State estimates obtained with an interval observer (box in dashed lines) and
an approximate set observer (union of light grey boxes on the right)



The algorithm ImageSp consists of three steps. First, $[x_0]$ is minced, i.e.,
divided into boxes of width less than a given precision factor $\varepsilon_I$. Then, the images
of all these boxes are evaluated using an inclusion function of $\Phi$ and stored into
a list C of image boxes. Finally, all boxes in C are merged to obtain a subpaving
guaranteed to contain $\Phi([x_0], [p_0], t)$. The time needed to obtain this subpaving
and the precision of the description (measured, e.g., using a Hausdorff distance
to the approximated set) increase when the precision factor $\varepsilon_I$ decreases.

The only requirement for ImageSp is the availability of an inclusion function
for $\Phi$, which is obtained using $[[\phi]]$.
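
The mincing and image-evaluation steps can be sketched as follows in Python. Several assumptions are made for the sake of a self-contained illustration: the map (x1, x2) -> (x1 + x2, x1 - x2) is a hypothetical stand-in for the flow enclosure, chosen because its image of a box is not a box; the final merging step is omitted (the list of image boxes is returned as is); and no outward rounding is applied.

```python
import math
from itertools import product

def mince(box, eps):
    """Split box = [(lo, hi), ...] into subboxes of width at most eps."""
    axes = []
    for lo, hi in box:
        n = max(1, math.ceil((hi - lo) / eps))
        w = (hi - lo) / n
        axes.append([(lo + k * w, lo + (k + 1) * w) for k in range(n)])
    return [list(cells) for cells in product(*axes)]

def image_sp(incl, box, eps):
    """List of image boxes whose union encloses the image of box."""
    return [incl(sub) for sub in mince(box, eps)]

def incl_map(box):
    """Inclusion function of (x1, x2) -> (x1 + x2, x1 - x2)."""
    (a, b), (c, d) = box
    return [(a + c, b + d), (a - d, b - c)]

def covered(boxes, point):
    return any(all(lo <= v <= hi for (lo, hi), v in zip(b, point))
               for b in boxes)

unit = [(0.0, 1.0), (0.0, 1.0)]
single = [incl_map(unit)]               # one box, as an interval observer
paving = image_sp(incl_map, unit, 0.25) # union of 16 image boxes
```

The point (2, 0.9) is not the image of any point of the unit square; the single enclosing box contains it, whereas the union of minced image boxes excludes it, which is the gain in accuracy that mincing is meant to provide.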

Remark 1. In the previous presentation, only $[x_0]$ has been minced, but if one
considered the extended state vector $x^e(t) = (x^T(t), p^T)^T$, with initial
condition $x_0^e \in [x_0^e] = ([x_0]^T, [p_0]^T)^T$, the mincing could have been performed on $[x_0^e]$.
When $[p_0]$ is a non-degenerate interval, the resulting enclosure is usually more
precise, but is obtained with an increased computational effort.



Implementable algorithm. Assume that $X(t)$ has to be evaluated, with $t$
such that $t = t_N$, and that $X(0) = [x_0]$. The following algorithm is a counterpart
to Algorithm 1.

Algorithm 2

For i = 0 to N - 1, do {

1. Prediction: evaluate $X^+(t_{i+1})$ using ImageSp;

2. Correction: evaluate $X(t_{i+1})$ using SIVIA with initial search domain
   $X^+(t_{i+1})$; }

Convergence properties have been established in [14] for SIVIA and in [18] for
ImageSp. The convergence of Algorithm 2 depends not only on $\varepsilon_S$ and $\varepsilon_I$, but
also on the quality of the enclosure of (1) provided by the pair of cooperative
systems.



2.4 Example 

Compartment models are frequently used in pharmacokinetics, chemistry or
biology. They consist of a collection of tanks containing material. These tanks,
represented by circles, exchange material between them and with the rest of the
world, as materialized by arrows. Each tank is supposed to be homogeneous
and the quantity of material in compartment $i$ is denoted by $x_i$. Many types of
compartment models are available (see, e.g., [5]), but all share the property that
the evolution of the quantities of material in the compartments is governed by
a state equation that may easily be enclosed between cooperative models.




Fig. 3. Two-compartment model 



To facilitate presentation, we shall consider a simple academic example, the 
structure of which is nevertheless typical of nonlinear compartmental models. 

Assume that the evolution of the vector of quantities of material $x = (x_1, x_2)$
in the compartments of the model of Figure 3 is given by

$$\begin{aligned} x_1' &= -\frac{p_1 x_1}{1 + p_2 x_1} - p_3 x_1 + p_4 x_2 + u, \\ x_2' &= \frac{p_1 x_1}{1 + p_2 x_1} - p_4 x_2, \end{aligned} \tag{12}$$

and that only $x_2$ is measured, according to the measurement equation

$$y(t_i) = (1 + e_i)\, x_2(t_i), \tag{13}$$

where $e_i$ is bounded.

All parameters are supposed to be known except for $p_1 \in [p_1] = [0.9, 1.1]$ and
the initial state of the system is only known to belong to $[x_0] = [0, 1] \times [0, 1]$.
Data have been simulated with the actual value of the parameter vector $p^* =
(1, 4/3, 1/2, 1/4)^T$, $x_0^* = (0, 0)^T$ and

$$u(t) = \begin{cases} 1 & \text{when } 0 \le t \le 1 \text{ or } 2.5 \le t \le 3.5, \\ 0 & \text{elsewhere.} \end{cases}$$

At 20 regularly spaced time instants from 0.5 s to 10 s, a measurement of $x_2$ is
taken and corrupted by a bounded relative noise $e_i \in [-0.1, 0.1]$. The problem
is to determine the set of all values of the state vector that are consistent with
the model, the measurements and their uncertainty.

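The dynamical model (12) can be bounded by the two models

$$\begin{aligned} \underline{x}_1' &= -\frac{\overline{p}_1 \underline{x}_1}{1 + p_2 \underline{x}_1} - p_3 \underline{x}_1 + p_4 \underline{x}_2 + u, \\ \underline{x}_2' &= \frac{\underline{p}_1 \underline{x}_1}{1 + p_2 \underline{x}_1} - p_4 \underline{x}_2, \end{aligned} \tag{14}$$

and

$$\begin{aligned} \overline{x}_1' &= -\frac{\underline{p}_1 \overline{x}_1}{1 + p_2 \overline{x}_1} - p_3 \overline{x}_1 + p_4 \overline{x}_2 + u, \\ \overline{x}_2' &= \frac{\overline{p}_1 \overline{x}_1}{1 + p_2 \overline{x}_1} - p_4 \overline{x}_2, \end{aligned} \tag{15}$$

which are easily proved to be cooperative, as the vector of quantities of material
is positive. Moreover, as $[x_0] = [0, 1] \times [0, 1]$, the conditions of Theorem 1 are
satisfied. Thus, the prediction part of the recursive state estimation algorithm
presented in Section 2.3 can be implemented using an inclusion function built
from (14) and (15) and evaluated by guaranteed numerical integration.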
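
The bracketing property can be observed numerically on this example. The Python sketch below is an illustration under explicit assumptions: the lower system is taken to use the upper bound of $p_1$ in the outflow term of the first equation and the lower bound in the inflow term of the second (and conversely for the upper system), and a plain forward Euler scheme replaces the guaranteed integration used in the paper, so the result is not a verified enclosure.

```python
P2, P3, P4 = 4 / 3, 1 / 2, 1 / 4       # known parameters (values of p*)
P1_LO, P1_HI = 0.9, 1.1                # bounds on the uncertain p1

def u(t):
    return 1.0 if (0 <= t <= 1) or (2.5 <= t <= 3.5) else 0.0

def rhs(x, t, p1_out, p1_in):
    """Right-hand side of (12); p1_out multiplies the flow leaving tank 1,
    p1_in the same flow entering tank 2 (equal for the nominal model)."""
    flow = x[0] / (1 + P2 * x[0])
    return (-p1_out * flow - P3 * x[0] + P4 * x[1] + u(t),
            p1_in * flow - P4 * x[1])

def euler(x0, p1_out, p1_in, h=1e-3, t_end=10.0):
    """Forward Euler trajectory (not a guaranteed integration)."""
    x, t, path = tuple(x0), 0.0, [tuple(x0)]
    for k in range(round(t_end / h)):
        dx = rhs(x, t, p1_out, p1_in)
        x = (x[0] + h * dx[0], x[1] + h * dx[1])
        t = (k + 1) * h
        path.append(x)
    return path

lower = euler((0.0, 0.0), P1_HI, P1_LO)   # lower system, lower corner of [x0]
upper = euler((1.0, 1.0), P1_LO, P1_HI)   # upper system, upper corner of [x0]

def bracketed(path, tol=1e-12):
    """True if path stays componentwise between lower and upper."""
    return all(lo[i] - tol <= x[i] <= up[i] + tol
               for lo, x, up in zip(lower, path, upper) for i in (0, 1))

samples = [euler(x0, p1, p1) for p1 in (0.9, 1.0, 1.1)
           for x0 in ((0.0, 0.0), (0.5, 0.5), (1.0, 1.0))]
```

Every sampled trajectory of the nominal model, for the tested values of $p_1$ and initial states in $[x_0]$, stays between the two bounding trajectories at all time steps.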

The correction step involves the SIVIA algorithm presented in Section 2.3.
The bounds for $v(t_i)$ in (3) are computed knowing that

$$y(t_i) = g(x^*(t_i), p^*)\,(1 + e_i(t_i)),$$

where $e_i(t_i)$ is a realization of a random variable with support restricted to
$[-0.1, 0.1]$ and

$$g(x, p) = x_2.$$

Thus

$$y(t_i) \in g(x^*(t_i), p^*)\,(1 + [-0.1, 0.1])$$

and

$$g(x^*(t_i), p^*) \in \frac{y(t_i)}{1 + [-0.1, 0.1]} \in (1 + [-0.081, 0.112])\, y(t_i).$$

At each measurement time, the measurement noise is thus known to belong to

$$[v(t_i)] = [-0.081, 0.112]\, y(t_i).$$

All resulting intervals guaranteed to contain the noise-free output of the system
are represented in Figure 4.

All resulting intervals guaranteed to contain the noise-free output of the system 
are represented on Figure 4. 



Fig. 4. Intervals guaranteed to contain the true values of $x_2(t_i)$



Two simulations have been performed, both with the algorithm described in
Section 2.3, but with different prediction steps. The first one (Case a) is performed
with a direct guaranteed integration of (12) taking into account the uncertain
parameters. The second (Case b) involves an inclusion function built with (14) and
(15). Guaranteed integration has been performed using the VNODE package,
see [24]. The lower and upper bounds of the smallest boxes enclosing the
predicted and corrected sets are represented for each measurement time in Figure 5
(for $x_1$, the lower bounds of the predicted sets coincide with the lower bounds
of the corrected sets). All computations have been performed with $\varepsilon_S = \varepsilon_I$ on
an Athlon 1800+ and the results are summarized in Table 1.

Table 1. Simulation results for recursive state estimation

                                     Case a    Case b    Case b
    $\varepsilon_S = \varepsilon_I$                  0.025     0.025     0.05
    Computing time (s)               31        50        13
    Volume of $X(10)$                  0.0034    0.0021    0.0035

In both cases, 90% of the computing time is spent during the first prediction
step, when the knowledge about the initial value of the state is poor; the last steps
take less than 0.1 s each. The enclosures obtained in Case b are more accurate
than in Case a for the same value of $\varepsilon_S$ and $\varepsilon_I$, and are obtained much faster when
a given final precision is required. These results illustrate the efficiency of the
bounding approach using cooperative systems.

3 Bounded-Error Parameter Estimation 

In this section, an estimate $\hat{p}$ for $p$ is to be obtained such that the output
$y_m(x(t), \hat{p}, t)$ of the model (1)-(3) is an acceptable approximation of the
output $y(t)$ of the system. Here, the state vector $x(t)$, if it is only known to
belong to a given box $[x(t)]$, plays the role of the nuisance uncertain quantity
that is not estimated.



3.1 Introduction 



Standard parameter estimation techniques (see, e.g., [28] and the references
therein) compute $\hat{p}$ as the argument of the minimum of a given cost function,
e.g.,

$$j(p) = (y - y_m(p))^T (y - y_m(p)),$$

where

$$y = (y^T(t_1), \ldots, y^T(t_N))^T$$

and

$$y_m(p) = (y_m^T(x(t_1), p, t_1), \ldots, y_m^T(x(t_N), p, t_N))^T$$

are the system and model outputs collected at given time instants $t_i$, $i =
1, \ldots, N$. This minimization can be performed by local-search algorithms such
as Gauss-Newton or Levenberg-Marquardt, but there is no guarantee of
convergence to a global minimizer of $j(p)$ and this minimizer may not even be
unique. Random search, using, e.g., simulated annealing or genetic algorithms,
cannot provide any guarantee either that the global minimum has been found
after finite computations. Only global guaranteed techniques, such as Hansen’s
algorithm [7], based on interval analysis, can obtain such guaranteed results.

Fig. 5. Recursive bounded-error state estimation; lower and upper bounds of the
smallest boxes enclosing the predicted set (dashed line) and corrected set (solid line);
Case a: direct guaranteed integration of the model; Case b: guaranteed integration of
the bounding cooperative systems

Parameter bounding is an alternative approach searching for the set of all pa- 
rameter vectors that are consistent with the experimental data, model structure 
and error bounds. It is similar to the correction step involved in the recursive 
state estimation algorithm presented in Section 2.2. 

3.2 Principle 

With the same hypotheses as in Section 2.2, the parameter vector
$p \in [p_0]$ is deemed acceptable if the difference between the output
$g(x(t_i), p)$ of the deterministic part of the model and the experimental
datum $y(t_i)$ remains in $[\underline{v}_i, \overline{v}_i]$ for all
$i = 1, \ldots, N$. Parameter estimation then amounts to characterizing the set
$P$ of all acceptable $p \in [p_0]$

$$ P = \{ p \in [p_0] \mid y(t_i) - g(x(t_i), p) \in [\underline{v}_i, \overline{v}_i], \; i = 1, \ldots, N \}. \tag{16} $$

When the observation equation (3) reduces to

$$ y_m(p, t) = h(p, t) + v(t), \tag{17} $$

with $h(p, t)$ some closed-form expression where $x(t)$ does not appear, then
the way $P$ may be characterized depends mainly on whether $h(p, t)$ is linear
in $p$. If it is, $P$ is a polytope that may be described exactly [27] or by an
outer-approximation, for instance using ellipsoids [3], [25]. When $h(p, t)$ is
nonlinear in $p$, $P$ is no longer a polytope and may even be disconnected. One
may nevertheless get a guaranteed enclosure of $P$ using SIVIA.

When no closed-form solution of the model equations is available, numerical
integration again has to be put to work to compute a box $[x(t_i)]$ containing
the state at each $t_i$ in order to enclose $g(x(t_i), p)$. The box $[x(t_i)]$
is obtained efficiently when (1) can be bounded between two cooperative systems
as in Section 2.3.

The characterization of $P$ is then realized using SIVIA, as presented in
Section 2.3. The main difference is that bisection is now performed in
$p$-space instead of $x$-space.
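
Bisection in $p$-space can be sketched as follows for the closed-form case (17). This is only an illustration under stated assumptions, not the implementation used in the paper: the model $h(p, t) = p_1 e^{-p_2 t}$, the data, the error bounds and all function names are hypothetical, and plain floating-point arithmetic is used where a validated implementation would rely on an interval library with outward rounding.

```python
import math

def h_bounds(p1, p2, t):
    """Range of h(p, t) = p1 * exp(-p2 * t) over the box p1 x p2 (needs p1 >= 0, t >= 0)."""
    # h increases with p1 and decreases with p2, so the range is attained at corners.
    return p1[0] * math.exp(-p2[1] * t), p1[1] * math.exp(-p2[0] * t)

def classify(box, data):
    """Return 'out', 'in' or 'unknown' for a parameter box against all error bounds."""
    p1, p2 = box
    inside = True
    for t, y, v_lo, v_hi in data:
        h_lo, h_hi = h_bounds(p1, p2, t)
        e_lo, e_hi = y - h_hi, y - h_lo      # range of the error y(t_i) - h(p, t_i)
        if e_lo > v_hi or e_hi < v_lo:
            return "out"                     # no p in the box is acceptable
        if e_lo < v_lo or e_hi > v_hi:
            inside = False                   # some p in the box may be unacceptable
    return "in" if inside else "unknown"

def sivia(box, data, eps):
    """Bisect in p-space; return (inner boxes, undetermined boxes narrower than eps)."""
    inner, boundary, stack = [], [], [box]
    while stack:
        p1, p2 = stack.pop()
        status = classify((p1, p2), data)
        if status == "out":
            continue
        if status == "in":
            inner.append((p1, p2))
        elif max(p1[1] - p1[0], p2[1] - p2[0]) < eps:
            boundary.append((p1, p2))
        elif p1[1] - p1[0] >= p2[1] - p2[0]:
            mid = 0.5 * (p1[0] + p1[1])
            stack += [((p1[0], mid), p2), ((mid, p1[1]), p2)]
        else:
            mid = 0.5 * (p2[0] + p2[1])
            stack += [(p1, (p2[0], mid)), (p1, (mid, p2[1]))]
    return inner, boundary

# Hypothetical noise-free data generated with p* = (2, 0.5); error bounds [-0.1, 0.1].
p_true = (2.0, 0.5)
data = [(t, p_true[0] * math.exp(-p_true[1] * t), -0.1, 0.1)
        for t in (0.0, 1.0, 2.0, 3.0, 4.0)]
inner, boundary = sivia(((0.0, 5.0), (0.0, 5.0)), data, eps=0.05)
print(len(inner), len(boundary))
```

The union of the inner and undetermined boxes is an outer approximation of $P$, and the inner boxes alone form an inner approximation. When the state has to be integrated numerically, `h_bounds` would be replaced by an enclosure of $g(x(t_i), p)$ computed from the box $[x(t_i)]$.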



3.3 Example 

Consider the same example as in Section 2.4, and suppose now that the initial
state is perfectly known, $x_0 = (0, 0)^T$, and that only the last two
components of the parameter vector are known, the first two $(p_1, p_2)$ being
only known to belong to $[0, 5] \times [0, 5]$.




Fig. 6. Parameter estimation; solution subpaving in the $(p_1, p_2)$-plane when $\varepsilon = 0.025$



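Data have been obtained using the same simulation conditions as in
Section 2.4. To evaluate an inclusion function for the state, the two bounding
cooperative systems are now

$$
\begin{cases}
\dot{\underline{x}}_1 = -\dfrac{\overline{p}_1 \underline{x}_1}{1 + \underline{p}_2 \underline{x}_1} - p_3^* \underline{x}_1 + p_4^* \underline{x}_2 + u, \\[2ex]
\dot{\underline{x}}_2 = \dfrac{\underline{p}_1 \underline{x}_1}{1 + \overline{p}_2 \underline{x}_1} - p_4^* \underline{x}_2,
\end{cases}
\tag{18}
$$

and

$$
\begin{cases}
\dot{\overline{x}}_1 = -\dfrac{\underline{p}_1 \overline{x}_1}{1 + \overline{p}_2 \overline{x}_1} - p_3^* \overline{x}_1 + p_4^* \overline{x}_2 + u, \\[2ex]
\dot{\overline{x}}_2 = \dfrac{\overline{p}_1 \overline{x}_1}{1 + \underline{p}_2 \overline{x}_1} - p_4^* \overline{x}_2.
\end{cases}
\tag{19}
$$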

The problem is now to evaluate the set of all parameter values $(p_1, p_2)$
that are compatible with the collected data and their associated error bounds
(see Figure 4). The SIVIA algorithm has been used with initial search box
$[0, 5] \times [0, 5]$ in parameter space. Guaranteed integration is again
performed with the help of VNODE. With $\varepsilon = 0.05$, the subpaving
represented in Figure 6 has been obtained in 400 s on an Athlon 1800+. It
contains the actual values of the parameters $(p_1^*, p_2^*) = (1, 4/3)$.



4 Conclusions 

This paper presents an alternative and guaranteed approach for parameter and 
state estimation for continuous-time nonlinear differential models in a context 
of bounded errors with known bounds. An outer-approximation of the set of 
all parameter or state vectors that are consistent with the model structure and 
experimental data is obtained. 

The only requirement is that the dynamical state equation of the system can 
be bounded between two cooperative systems. This is the case for all compart- 
ment models and for many other positive systems, i.e., systems for which the 
state and parameters are constrained to remain positive. 

The benefit of the enclosure between cooperative systems has been illustrated 
on an example. An ODE with uncertain parameters is replaced by two bounding 
ODEs with known parameters, the integration of which can be performed much 
more accurately, eliminating the wrapping effect.

Abstract. Testing parametric models for identifiability is particularly
important for knowledge-based models. If several values of the param- 
eter vector lead to the same observed behavior, then one may try to 
modify the experimental set-up to eliminate this ambiguity (which cor- 
responds to performing qualitative experiment design). The tediousness 
of the algebraic operations involved in such tests makes computer alge- 
bra particularly attractive. This paper describes some limitations of this 
classical approach and explores an alternative route based on new defini- 
tions of identifiability and numerical tests implemented in a guaranteed 
way. The new approach is illustrated in the context of compartmental 
modeling, widely used in biology. 



1 Introduction 

In many domains of pure and applied sciences, one would like to build a math- 
ematical model from input-output experimental data. Sometimes, the only pur- 
pose of modeling is to mimic these observations, with no physical interpretation 
in mind. One then speaks of a black-box model. The situation considered in this 
paper is different. It is assumed that some prior knowledge is used to build a 
mathematical model that depends on a vector of parameters to be estimated 
from the data. If the model is entirely based on such prior knowledge, one
speaks of a white-box model. This is an idealized situation seldom encountered 
and the model is often a mixture of knowledge-based and black-box parts. One 
then speaks of a gray-box model. For white-box and gray-box models, all or some 
of the parameters receive a physical interpretation, and one would like to make 
sure that these parameters can be estimated meaningfully. 

Let u be the (known) vector of the inputs of the system, which is usually a 
function of time t, and let y (t) be the corresponding vector of the outputs of the 
system at time t. A typical set-up for estimating the vector p of the parameters
of the model of this system (see, for instance, [1], [2] or [3]) is to give the
system and model the same input (one then speaks of a parallel model), and to
look for the estimate $\hat{p}$ that minimizes the sum of the squares of the
differences between the system and model outputs

$$ \hat{p} = \arg\min_{p} \sum_{i=1}^{f} (y(t_i) - y_m(t_i, p))^T (y(t_i) - y_m(t_i, p)). $$

In this equation, the $t_i$ are the instants of time at which the outputs of
the system are measured and $y_m(t, p)$ is the vector of the outputs of the
model at time $t$ when the parameter vector takes the value $p$. The dependence
of $y$ and $y_m$ on the input $u$ is omitted to simplify notation. When $p$ has
a physical meaning, one would like to know whether finding a numerical value
for $\hat{p}$ gives any indication about the actual values of the physical
parameters of the system under investigation. If not, one may try to modify the
experimental set-up in order to remove the ambiguity. This is why it is
desirable to reach a conclusion as soon as possible (if possible before
performing any actual experimentation). A partial answer is found, under
idealized conditions, with the concept of identifiability. We shall start by
presenting the classical notion of identifiability before pointing out some of
its limitations and proposing alternative definitions of identifiability and a
guaranteed numerical test method consistent with these new definitions.

2 Classical Approach to Identifiability Testing 

Assume that there is no measurement noise or system perturbation, that the
input and measurement times can be chosen in the most informative manner and
that the system is actually described by a model with output $y_m(t, p^*)$,
where $p^*$ is the (unknown) true value of the parameter vector. Under these
idealized conditions, it is always possible to find at least one $\hat{p}$ such
that the “system” with parameters $p^*$ and the “model” with parameters
$\hat{p}$ behave in exactly the same manner for all inputs and times, which we
shall denote by

$$ y_m(t, \hat{p}) = y_m(t, p^*). \tag{1} $$

It suffices to take the trivial solution $\hat{p} = p^*$ for (1) to be
satisfied. If this solution is unique, then the model is said to be globally
(or uniquely) identifiable. This is of course desirable. Unfortunately, there
may be parasitic solutions. If the number of solutions of (1) for $\hat{p}$ is
greater than one, then we know that even under idealized conditions it will not
be possible to estimate meaningfully all components of $p^*$ with a single
point estimate such as $\hat{p}$.

Fig. 1. Compartmental model

As an illustrative example, consider the compartmental model described by
Figure 1. Each circle represents a tank. The $i$th tank contains a quantity
$x_i$ of material. These tanks exchange material between themselves and with
the exterior as indicated by arrows. A usual assumption in linear compartmental
modeling is that the flow of material leaving a compartment via an arrow is
proportional to the quantity of material in this compartment. The constants of
proportionality of these exchanges are then parameters to be estimated. Note
that even when the compartmental model is linear, its output is nonlinear in
these parameters, which significantly complicates their estimation. The
dynamical state-space equations associated with a given compartmental model are
very simple to obtain by writing down mass balances for each compartment. Such
models, or variants of them, are widely used in biology and find applications
in other experimental sciences such as pharmacokinetics, chemistry or ecology
[4], [5]. For the model of Figure 1, mass balances in Compartments 1 and 2
lead to



$$ \frac{dx_1}{dt} = -(p_1 + p_2) x_1 + p_3 x_2 + u $$

and

$$ \frac{dx_2}{dt} = p_1 x_1 - p_3 x_2. $$

Assume that there is no material in the system at time 0, so $x(0) = 0$, and
that the quantity of material in Compartment 2 can be measured at any positive
time, so

$$ y_m(t, p) = x_2(t, p). $$

The question we are interested in is as follows: assuming that noise-free data are 
generated by a compartmental model with the structure described by Figure 1 
and parameters p*, can the value of p* be recovered from an analysis of the 
input-output data? 

An obvious difficulty with this question is that the numerical value of $p^*$
is unknown (since the very purpose of the exercise is to estimate it!), so we
would like to reach a conclusion that would not depend on this value.
Unfortunately, this is impossible in general, because there are usually
atypical hypersurfaces in parameter space for which the conclusion is not the
same as for all other values of the parameter vector. An example of such an
atypical hypersurface is the plane defined by $p_1 = 0$ for the model of
Figure 1. Indeed, if there is no flow from Compartment 1 to Compartment 2, then
no material ever reaches Compartment 2 and $y(t) = y_m(t, p^*) = 0$, so there
is no information in the system output about $p_2$ and $p_3$. This is of course
pathological and one would not use such a model if one had reasons to believe
that there is no exchange from Compartment 1 to Compartment 2. The existence of
such pathological situations led to the following usual definition of
structural (or generic) identifiability [6]: a model is structurally globally
identifiable (s.g.i. for short) if for almost any value of $p^*$

$$ y_m(t, \hat{p}) = y_m(t, p^*) \Rightarrow \hat{p} = p^*. $$

If a model is not s.g.i., then there are several values of p for the same input-output 
behavior, and it is impossible to find out which one of them corresponds to p* 
even in our idealized noise-free experimental set-up. The situation can only get 
worse in the presence of noise or perturbations. Moreover, since there are
several models with the same behavior, there are several ways of reconstructing
non-measured state variables, e.g., by Kalman filtering, with different results. So it
is important to test models for identifiability whenever unknown parameters or 
state variables have a physical meaning or when decisions are to be taken on the 
basis of the numerical values of the estimates of these quantities. 

A typical method of test consists of two steps. The first one is the derivation
of algebraic equations that $\hat{p}$ and $p^*$ must satisfy for (1) to hold
true. For the model of Figure 1, it is easy to show that its transfer function
is

$$ \frac{y_m(s)}{u(s)} = \frac{p_1}{s^2 + (p_1 + p_2 + p_3) s + p_2 p_3}, $$

or equivalently that

$$ \frac{d^2 y_m}{dt^2} + (p_1 + p_2 + p_3) \frac{dy_m}{dt} + p_2 p_3 y_m = p_1 u. $$

So, for almost any value of $p^*$, (1) holds true if and only if

$$
\begin{cases}
\hat{p}_1 = p_1^*, \\
\hat{p}_1 + \hat{p}_2 + \hat{p}_3 = p_1^* + p_2^* + p_3^*, \\
\hat{p}_2 \hat{p}_3 = p_2^* p_3^*.
\end{cases}
$$

The second step is then the search for all solutions of these equations for
$\hat{p}$. In the case of the model of Figure 1, these solutions are the
trivial solution $\hat{p} = p^*$ and

$$
\begin{cases}
\hat{p}_1 = p_1^*, \\
\hat{p}_2 = p_3^*, \\
\hat{p}_3 = p_2^*.
\end{cases}
$$

The model of Figure 1 is therefore not s.g.i. The roles of $p_2$ and $p_3$ can
be interchanged, and it is impossible to know which is which. Moreover, since
there are two models with the same input-output behavior, there are two ways of
reconstructing $x_1$ from measurements of $x_2$, even in a noise-free
situation, leading to different values of $x_1$. Note that the parameter $p_1$,
which takes the same value in the two solutions, is s.g.i., and recall that
most of this analysis becomes false if $p_1 = 0$.
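
The interchangeability of $p_2$ and $p_3$ can be checked numerically. The sketch below is illustrative only: the parameter values, the unit step input, the step size and the function name `simulate` are assumptions, and an ordinary (non-validated) Runge-Kutta scheme is used. It integrates the mass-balance equations of the model of Figure 1 for a parameter vector and for the same vector with its last two components swapped.

```python
def simulate(p, t_end=10.0, h=0.01):
    """RK4 integration of the model of Figure 1 with x(0) = 0 and a unit step input."""
    p1, p2, p3 = p

    def f(x1, x2):
        u = 1.0
        return -(p1 + p2) * x1 + p3 * x2 + u, p1 * x1 - p3 * x2

    x1 = x2 = 0.0
    for _ in range(round(t_end / h)):
        k1 = f(x1, x2)
        k2 = f(x1 + 0.5 * h * k1[0], x2 + 0.5 * h * k1[1])
        k3 = f(x1 + 0.5 * h * k2[0], x2 + 0.5 * h * k2[1])
        k4 = f(x1 + h * k3[0], x2 + h * k3[1])
        x1 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        x2 += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x1, x2

x1_a, x2_a = simulate((1.0, 2.0, 3.0))   # hypothetical p* = (1, 2, 3)
x1_b, x2_b = simulate((1.0, 3.0, 2.0))   # second and third components interchanged
print(abs(x2_a - x2_b))   # measured output x2: identical up to rounding
print(x1_a, x1_b)         # unmeasured state x1: clearly different
```

With these values the two outputs coincide, while the reconstructed first state settles near $1/2$ in one case and near $1/3$ in the other (its steady state for a unit step is one over the second component of the parameter vector).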

3 Limitations of This Classical Approach

Steps 1 and 2 of structural identifiability testing require algebraic manipulations 
that may become exceedingly complicated for models of a more realistic size. 
Both are facilitated by computer algebra [7], but these algebraic manipulations 
may become so complex that they are no longer feasible even on present-day 
computers. Moreover, taking into account the fact that only real solutions are
of interest is still a subject of research with computer algebra. Failing to detect 
that all solutions for p but one are complex would mean failing to detect that 
the parameters are actually globally identifiable. 

Consider, for example, the (static) one-parameter model

$$ y_m(p) = p(p - 1)(p + 1). $$

Equation (1) translates into

$$ \hat{p}(\hat{p} - 1)(\hat{p} + 1) = p^*(p^* - 1)(p^* + 1), $$

and the set of real solutions for $\hat{p}$ is a singleton, a pair or a triple
depending on the value taken by $p^*$. So global identifiability is not a
structural property for this model.
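
The three cases can be made explicit by a short calculation, given here as a worked check that uses only the model above. Moving everything to one side and factoring out the trivial solution gives

$$ \hat{p}^3 - \hat{p} - (p^{*3} - p^*) = (\hat{p} - p^*)(\hat{p}^2 + \hat{p} p^* + p^{*2} - 1). $$

The quadratic factor has discriminant $4 - 3p^{*2}$ and takes the value $3p^{*2} - 1$ at $\hat{p} = p^*$. Hence the set of real solutions is

$$
\begin{cases}
\text{a singleton } \{p^*\} & \text{if } |p^*| > 2/\sqrt{3}, \\
\text{a pair } \{p^*, -p^*/2\} & \text{if } |p^*| = 2/\sqrt{3}, \\
\text{a pair } \{p^*, -2p^*\} & \text{if } |p^*| = 1/\sqrt{3}, \\
\text{a triple} & \text{otherwise.}
\end{cases}
$$

For instance, $p^* = 2$ can be recovered uniquely, whereas $p^* = 0$ cannot be distinguished from $\hat{p} = \pm 1$.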

These shortcomings call for new definitions of identifiability, first presented
in [8].

4 New Definitions and Method of Test 

The parameter $p_i$ will be said to be globally identifiable in $P$ (g.i.i.P)
if for all $(p^*, \hat{p})$ in $P \times P$, $y_m(t, \hat{p}) = y_m(t, p^*)$
implies that $\hat{p}_i = p_i^*$. The model will be g.i.i.P if all of its
parameters are g.i.i.P. With this new definition of identifiability, atypical
hypersurfaces are no longer allowed in $P$ and unique identifiability can be
established even if the model is not structurally globally identifiable. It
makes sense to study identifiability in a specific region $P$ of parameter
space, if only because some information is usually available on the sign and
possible range for each physical parameter.

It does not suffice to have realistic new definitions of identifiability;
methods of test are also needed. A model will be g.i.i.P if and only if

$$ \nexists\, (p^*, \hat{p}) \in P \times P \text{ such that } y_m(t, \hat{p}) = y_m(t, p^*) \text{ and } \|\hat{p} - p^*\|_\infty > 0. $$

In practice, it will usually suffice to prove that

$$ \nexists\, (p^*, \hat{p}) \in P \times P \text{ such that } y_m(t, \hat{p}) = y_m(t, p^*) \text{ and } \|p^* - \hat{p}\|_\infty > \delta, $$

where $\delta$ is some small positive number to be chosen by the user. The
model will then be said to be $\delta$-g.i.i.P. Testing whether a model is
$\delta$-g.i.i.P boils down to a constraint satisfaction problem (CSP). The
algorithm SIVIA, combined with a forward-backward contractor, can be used to
bracket the solution set $S$ of the CSP

$$ (p^*, \hat{p}) \in P \times P, \quad y_m(t, \hat{p}) = y_m(t, p^*), \quad \|p^* - \hat{p}\|_\infty > \delta $$

between inner and outer approximations:

$$ \underline{S} \subset S \subset \overline{S}. $$

If $\overline{S}$ is empty, then the model is $\delta$-g.i.i.P. If
$\underline{S}$ is not empty, then the model is not $\delta$-g.i.i.P. Details
about SIVIA can be found in the paper by Kieffer and Walter in this volume and
in [9], where forward-backward contractors are also presented.
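
For the static one-parameter model of Section 3, the outer approximation can be computed by plain bisection. The sketch below is a simplified illustration, not the SIVIA implementation referred to above: no contractor is used, the arithmetic is ordinary floating point without outward rounding, and the function names, the domains and the values of `delta` and `eps` are assumptions made for the example.

```python
def f_range(lo, hi):
    """Enclosure of y_m(p) = p(p - 1)(p + 1) = p**3 - p over [lo, hi]."""
    # p**3 is increasing and -p is decreasing, so this encloses the true range.
    return lo ** 3 - hi, hi ** 3 - lo

def undiscarded_boxes(P, delta, eps):
    """Bisect P x P and return the small boxes (p*, p_hat) that cannot be ruled out."""
    kept, stack = [], [(P, P)]
    while stack:
        a, b = stack.pop()                   # a encloses p*, b encloses p_hat
        if max(b[1] - a[0], a[1] - b[0]) <= delta:
            continue                         # |p* - p_hat| <= delta on the whole box
        fa, fb = f_range(*a), f_range(*b)
        if fb[0] - fa[1] > 0 or fb[1] - fa[0] < 0:
            continue                         # the two outputs cannot coincide here
        wa, wb = a[1] - a[0], b[1] - b[0]
        if max(wa, wb) < eps:
            kept.append((a, b))
        elif wa >= wb:
            m = 0.5 * (a[0] + a[1])
            stack += [((a[0], m), b), ((m, a[1]), b)]
        else:
            m = 0.5 * (b[0] + b[1])
            stack += [(a, (b[0], m)), (a, (m, b[1]))]
    return kept

n_monotone = len(undiscarded_boxes((1.2, 3.0), delta=0.1, eps=0.02))
n_general = len(undiscarded_boxes((-1.2, 1.2), delta=0.1, eps=0.02))
print(n_monotone)       # 0: the outer approximation is empty on [1.2, 3]
print(n_general > 0)    # True: some boxes survive on [-1.2, 1.2]
```

On $[1.2, 3]$, where the model output is strictly increasing, every box is discarded, so the outer approximation is empty and the model is $\delta$-g.i.i.P there. On $[-1.2, 1.2]$ some boxes survive, for example around $(p^*, \hat{p}) = (0, 1)$. Surviving boxes alone do not prove non-identifiability; that requires a non-empty inner approximation.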

5 Benchmark Example 

The model of Figure 2 could serve as a benchmark example. It has been proposed
to describe the distribution of drugs such as Glafenine in the body [10], [11]
after oral administration. Compartment 1 corresponds to the drug in the
gastrointestinal tract, and Compartments 2 and 3 respectively correspond to the
drug and its metabolite in the systemic circulation.

Fig. 2. Model of the distribution of Glafenine

The state equation of this model is

$$
\begin{cases}
\dfrac{dx_1}{dt} = -(p_1 + p_2) x_1 + u, \\[2ex]
\dfrac{dx_2}{dt} = p_1 x_1 - (p_3 + p_5) x_2, \\[2ex]
\dfrac{dx_3}{dt} = p_2 x_1 + p_3 x_2 - p_4 x_3.
\end{cases}
$$

By measuring the plasma concentration of the drug and its metabolite, the
quantities of drug in Compartments 2 and 3 are determined up to unknown
multiplicative constants, so

$$ y_m(t, p) = \begin{pmatrix} p_6 x_2(t) \\ p_7 x_3(t) \end{pmatrix}, $$



where $p_6$ and $p_7$ are respectively the inverses of the volumes of
Compartments 2 and 3. The dimension of the parameter vector $p$ is thus seven.
The corresponding transfer matrix is trivial to obtain by taking the Laplace
transform of the state and observation equations and then eliminating the state
variables. The same approach as in the introductory example of Section 2 can
then be used to obtain a set of nonlinear equations in $\hat{p}$ and $p^*$ that
are equivalent to

$$ y_m(t, \hat{p}) = y_m(t, p^*). $$



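These equations can be written as

$$
\begin{aligned}
\hat{p}_1 \hat{p}_6 &= p_1^* p_6^*, \\
\hat{p}_2 \hat{p}_7 &= p_2^* p_7^*, \\
\hat{p}_7 (\hat{p}_1 \hat{p}_3 + \hat{p}_2 \hat{p}_3 + \hat{p}_2 \hat{p}_5) &= p_7^* (p_1^* p_3^* + p_2^* p_3^* + p_2^* p_5^*), \\
\hat{p}_1 + \hat{p}_2 + \hat{p}_3 + \hat{p}_5 &= p_1^* + p_2^* + p_3^* + p_5^*, \\
\hat{p}_1 \hat{p}_3 + \hat{p}_1 \hat{p}_5 + \hat{p}_2 \hat{p}_3 + \hat{p}_2 \hat{p}_5 &= p_1^* p_3^* + p_1^* p_5^* + p_2^* p_3^* + p_2^* p_5^*, \\
\hat{p}_1 + \hat{p}_2 + \hat{p}_3 + \hat{p}_4 + \hat{p}_5 &= p_1^* + p_2^* + p_3^* + p_4^* + p_5^*, \\
\hat{p}_1 \hat{p}_3 + \hat{p}_1 \hat{p}_4 + \hat{p}_1 \hat{p}_5 + \hat{p}_2 \hat{p}_3 + \hat{p}_2 \hat{p}_4 + \hat{p}_2 \hat{p}_5 + \hat{p}_3 \hat{p}_4 + \hat{p}_4 \hat{p}_5 &= p_1^* p_3^* + p_1^* p_4^* + p_1^* p_5^* + p_2^* p_3^* + p_2^* p_4^* + p_2^* p_5^* + p_3^* p_4^* + p_4^* p_5^*, \\
\hat{p}_4 (\hat{p}_1 \hat{p}_3 + \hat{p}_1 \hat{p}_5 + \hat{p}_2 \hat{p}_3 + \hat{p}_2 \hat{p}_5) &= p_4^* (p_1^* p_3^* + p_1^* p_5^* + p_2^* p_3^* + p_2^* p_5^*).
\end{aligned}
$$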



Obtaining them is facilitated by the use of computer algebra.

We said in [8] that this model was $\delta$-g.i.i.P for
$P = [0.6, 1]^{\times 7}$ and $\delta = 10^{-9}$, but this remains to be
confirmed, as this result may have been obtained with incorrect software.

6 Conclusions

The concept of identifiability is important whenever physically meaningful pa- 
rameters or state variables are to be estimated from experimental data. Testing 
models for structural global identifiability is not always possible, even with the 
help of computer algebra, and when a conclusion can be reached, it is not always 
relevant. This has led us to propose new definitions of global identifiability in a 
domain of parameter space. With this definition, it is possible to prove identifi- 
ability even in cases where the parameters are not structurally identifiable. The 
tests are performed via interval constraint satisfaction programming, with the 
use of contractors to avoid bisection as much as possible, thereby reducing the 
effect of the curse of dimensionality. We hope to have convinced the reader that 
identifiability testing is both a useful part of model building and an interesting 
challenge for interval analysts. 

In this paper, it was assumed that there was a single model structure to be 
considered for the description of the data. When several model structures are in 
competition, a natural question to ask is whether it will be possible to select one 
that is more appropriate than the others. This question can be answered in the 
same type of idealized setting as considered for identifiability and corresponds 
then to the notion of distinguishability. The methodology advocated here for 
testing models for identifiability readily extends to the test of model structures 
for distinguishability.

Abstract. We will show how a variety of interval algorithms have found
their use in the multibody modeling program MOBILE. This paper acquaints
the reader with the key features of this open source software,
describes how interval arithmetic helps to implement new transmission
elements, and reports on interval modeling of dynamics, which is an 
inherent part of multibody simulations. In the latter case, the interval 
extension of MOBILE enhanced with an interval initial value problem 
solver (based on VNODE) is presented. The functionality of this appli- 
cation is shown with some examples. We provide insights into techniques 
used to enhance already existing modeling software with interval arith- 
metic concepts. 



1 Interval Arithmetic in MOBILE: Areas of Application 
and Integration Strategies 

Interval arithmetic is often criticized for its inapplicability to real life problems. 
This work claims the contrary by showing how it can be employed in multibody 
systems’ modeling, an important area of applied physics, and in particular, in 
the program MOBILE. Interval arithmetic is used here to not only ensure the 
validity of the obtained results, but to also provide new modeling opportunities. 

Mechanical interactions are usually modeled with the help of differential
equations. It would be very time consuming to derive these equations manually
each time. For that reason, various types of modeling software have found a
market in industry. Usually, such software produces the respective system of
differential equations from the (formalized) description of an arbitrary
mechanical system and is also capable of solving it, thus characterizing the
required properties of the system.

In the present context, we employ the multibody library MOBILE described 
in [1,2]. It is able to model arbitrary mechanical systems and is characterized by 
its high computational speed (section 2 of this paper describes this program in 
more detail). 

In the process of solving different problems with MOBILE, new tasks presented
themselves, some of which proved to be most effectively dealt with by applying
interval techniques. As a simple example of such a task, the incorporation of
some external measurements as parameters into a model can be considered.
Measurements are usually performed with a (small) error, the influence of which
on the system’s behavior is sometimes of interest. Moreover, the models always
differ, if only slightly, from real life systems. Hence, it is useful to allow
some uncertainty in the model and see how it affects the results.

Intervals offer an elegant way of solving the above tasks. Thus, it is appropriate
to combine interval principles with modeling algorithms, which presupposes
integration of interval methods into the already existing program MOBILE. This 
integration is performed in three layers. The basic layer is the interfacing between 
MOBILE and interval arithmetic; the package Profil/Bias [3] was chosen to 
provide the appropriate data types and methods. 

Based on this interface, additional structures need to be defined. For example,
a simple replacement of floating-point arithmetic with interval arithmetic is
insufficient because of its undesirable by-products such as the wrapping
effect. Therefore, we have to improve the “naive” interval extension by
exploiting knowledge about underlying MOBILE structures. Once this is done, the
interval extension can be enhanced with more complicated algorithms, for
example, for solving ordinary differential equations (ODEs) or computing
validated distances, and new MOBILE components can be designed, which allow,
for example, uncertainty in measurements. All that constitutes the middle layer
of integration, on top of which a connection to the outside world can be
considered. Thus, the third and last integration layer requires building
interfaces to industrial modeling software [4].

Our goal is to implement an extension of MOBILE capable of interval cal- 
culus. To achieve this, we will proceed on two levels: implementation of interval 
kinematics and interval dynamics of mechanical systems. To develop the former, 
basic interval algorithms, such as addition, subtraction, etc. are required, as well 
as more complicated ones, such as solution of interval constraint equations, etc. 
The present state of this side of implementation is reflected in section 3 of this 
paper. To implement interval dynamics, one has to find an interface between 
interval initial value problem solvers (IIVPS) and MOBILE. 

The task of integrating an IIVPS into certain types of modeling software is 
not completely free from difficulties. One of the major problems is obtaining 
derivatives. 

On the one hand, there are several interval algorithms to solve initial value
problems. Their common feature is the use of several derivatives of the system
function. As a rule, the higher their order, the tighter the enclosure
obtained. The well-known derivative-free methods from numerics (Adams,
Runge-Kutta, etc.) have proved hard to adapt to intervals.

On the other hand, most modeling software has no facilities to produce
derivatives of arbitrary order. The usual methods of automatic differentiation,
employed in many IIVPS, cannot be used, because they require the right-hand
side of a given problem to be symbolically expressed, while in most cases this
expression remains unknown. All the information given about the system function
is its “numerical” values at some arbitrary points and its algorithmic
representation in a certain programming language. Therefore, the additional
task to be solved on this implementation level is obtaining derivatives, which
comply with the demands of validated algorithms, using only the above
information. Possible ways of dealing with this problem as well as achievements
towards modeling of systems’ dynamics are described in section 4 of this paper.

A short summary of the most important results and an outlook on further
work can be found at the conclusion of this paper.

2 The Multibody Modeling Library MOBILE 

The modeling and simulation of the dynamical behavior of mechanical systems 
is a well-studied field in mechatronics. During the last 25 years, a large number 
of researchers have developed several formalisms for the automatic generation 
and resolution of the dynamical equations of multibody systems [5]. Some of 
these methods are still used today as universal engines for mechanics-based cal- 
culations in modern CAD systems, including for example ADAMS [6] in IDEAS 
and SD-FAST [7] in Pro/ENGINEER. Other formalisms have concentrated on 
specific areas of engineering, including for example recursive methods [8,9] for 
robotics, or symbolic computation methods for real time applications [10,11]. 
These approaches have the advantage of being comprehensive and providing
comfortable user interfaces. However, due to their monolithic structure, they
lack efficiency and the capability to interact with other simulation packages.

The present approach uses object-oriented programming for defining an
open-architecture multibody library. The mechanical components are modeled as
abstract mappings, termed kinetostatic transmission elements, which transmit
motion and loads between sets of input and output variables called state objects.
This results in an intuitive formulation, which allows the designer to put together 
the models of the parts of a mechatronic system in virtually the same way as they 
would actually be assembled in the real world. Moreover, by substituting other 
mathematical objects for real numbers, generic multibody formulations can be 
obtained which can be used for example for interval, stochastic, and fuzzy analy- 
sis. The multibody library MOBILE was implemented using the object-oriented 
programming language C++. Currently, only rigid bodies are modeled, but ex- 
tensions to problems of structural mechanics, hydraulics and control theory can 
be incorporated into the general procedure. 

Mathematically, the operations relating to the kinetostatic transmission
elements correspond to well-known mappings of differential geometry: the
transmission of position and velocity corresponds to a nonlinear mapping
between two smooth manifolds and the corresponding push-forward function for
tangent vectors, while force mapping corresponds to the pull-back function
being applied to cotangent vectors.

From a computational point of view, the applied method renders a
responsibility-driven client/server model [12] in which multibody operations
are defined as “services” provided by an object at any time during program
execution independently of its internal implementation according to a specific
“contract”. In MOBILE, the basic “contract” of kinetostatic transmission
elements consists of two main services: one for transmission of motion
(“doMotion”) and one for transmission of forces (“doForce”). More elaborate
objects are defined at the following three levels of modeling complexity:
(1) basic modeling, which involves only pure kinetostatic transmission
elements, (2) sparse-Jacobian modeling, in which the interconnection structure
and efficient methods for obtaining velocity transformations are considered,
and (3) inertia-transmission modeling, in which the individual components are
regarded as Riemannian manifolds able to generate and transmit mass properties.
A description of the latter two levels of modeling complexity can be found in
[1] and [13].

2.1 The Concept of Kinetostatic Transmission Elements 

The central modeling element for mechanical systems is the kinetostatic
transmission element (Fig. 1), which regards a mechanical component as an
element MoMap that maps a set of $n$ scalar variables collected in the input
vector $q$ to a set of $m$ scalar variables collected in the output vector $q'$.




Fig. 1. Simple model of a kinetostatic transmission element 

Associated with this mapping, there exist three kinematic functions and a
force-associated function. The kinematic functions are the mapping itself and
its first and second derivatives. These are collected in the motion
transmission functions

$$
\begin{aligned}
\text{position:} \quad & q' = \varphi(q), \\
\text{velocity:} \quad & \dot{q}' = J_\varphi \dot{q}, \\
\text{acceleration:} \quad & \ddot{q}' = J_\varphi \ddot{q} + \dot{J}_\varphi \dot{q}.
\end{aligned}
\tag{1}
$$

Here, $J_\varphi = \partial\varphi / \partial q$ is the $m \times n$ Jacobian
of the transmission element, which is not required explicitly by the clients of
the MoMap element. For the force transmission function, one assumes that the
transmission element is ideal, i.e., that it neither consumes nor produces
power. Then, the virtual work at the input and at the output are equal:

$$ \delta q^T Q = \delta q'^T Q'. \tag{2} $$

After substituting $\delta q' = J_\varphi \delta q$ and noting that this
condition must hold for all virtual displacements $\delta q \in \mathbb{R}^n$,
one obtains

$$ \text{force:} \quad Q = J_\varphi^T Q', \tag{3} $$

where $J_\varphi^T$ is the transpose of the Jacobian $J_\varphi$.







This transformation is directed from the (kinematical) output of the
transmission element to its (kinematical) input. Note also that, in general,
$J_\varphi$ need not be regular, in fact, not even square, so one cannot assume
that (3) can be inverted. Thus force transmission is in general directed in the
opposite direction to motion transmission.
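
The relations (1)-(3) can be illustrated with a small element that has a non-square Jacobian. The sketch below is not MOBILE code: the class `CrankElement`, its methods and the numerical values are hypothetical, the services take explicit arguments for brevity, and the acceleration transmission is omitted.

```python
import math

class CrankElement:
    """Hypothetical element: angle q (n = 1) -> tip position q' = (L cos q, L sin q) (m = 2)."""

    def __init__(self, length):
        self.length = length

    def jacobian(self, q):
        # J = d(phi)/dq, an m x n = 2 x 1 matrix stored as a pair.
        return (-self.length * math.sin(q), self.length * math.cos(q))

    def do_motion(self, q, qd):
        """Position and velocity transmission: q' = phi(q), qd' = J qd."""
        J = self.jacobian(q)
        pos = (self.length * math.cos(q), self.length * math.sin(q))
        vel = (J[0] * qd, J[1] * qd)
        return pos, vel

    def do_force(self, q, force_out):
        """Force transmission from output to input: Q = J^T Q'."""
        J = self.jacobian(q)
        return J[0] * force_out[0] + J[1] * force_out[1]

arm = CrankElement(length=2.0)
q, dq = 0.3, 1e-3                    # configuration and a virtual displacement
Q_out = (1.5, -0.7)                  # load applied at the output
J = arm.jacobian(q)
dq_out = (J[0] * dq, J[1] * dq)      # delta q' = J delta q
work_in = dq * arm.do_force(q, Q_out)
work_out = dq_out[0] * Q_out[0] + dq_out[1] * Q_out[1]
print(work_in, work_out)             # equal up to rounding, as required by (2)
```

Because the Jacobian here is 2 x 1, the output load determines the input force through (3), but the input force does not determine the output load, which is the non-invertibility noted above.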

In MOBILE, each transmission element “remembers” its once defined inputs
and outputs for its lifetime. Hence, “doMotion” and “doForce” can be executed
by linking dynamically and without any arguments. Moreover, kinetostatic
transmission elements can be concatenated by connecting the outputs
of one element to the inputs of the other. The transmission functions of such 
a composite transmission element (termed MoMapChain in MOBILE) can be re- 
alized by concatenation of motion transmission in the order of the mechanical 
chain starting at the inertial system, and in reverse order for force transmis- 
sion. In MOBILE, MoMapChain objects are simply ordered lists supporting the 
“<<”-operator for appending a MoMap object on the right to a MoMapChain on 
the left. 
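
The chaining idea can be illustrated with a small stand-alone sketch. It is not
MOBILE code: the names ScalarState, Map, Gear and Chain are hypothetical, and
the only element shown is an ideal scalar transmission whose Jacobian is a
constant ratio. Motion is passed through the list from front to back, force
from back to front, and the force rule is the scalar case of Q = J^T Q'.

```cpp
#include <vector>

// Hypothetical scalar state object: position, velocity and generalized force.
struct ScalarState {
    double q = 0.0, qd = 0.0, Q = 0.0;
};

// Hypothetical base class playing the role of a transmission element.
// Inputs and outputs are bound at construction, so the calls take no arguments.
struct Map {
    virtual ~Map() = default;
    virtual void doMotion() = 0;  // input -> output
    virtual void doForce() = 0;   // output -> input
};

// Ideal scalar transmission q' = ratio * q, whose Jacobian is the constant ratio.
struct Gear : Map {
    ScalarState& in;
    ScalarState& out;
    double ratio;
    Gear(ScalarState& i, ScalarState& o, double r) : in(i), out(o), ratio(r) {}
    void doMotion() override {
        out.q = ratio * in.q;
        out.qd = ratio * in.qd;
    }
    void doForce() override { in.Q = ratio * out.Q; }  // Q = J^T Q'
};

// Ordered list of elements: motion runs front to back, force back to front.
struct Chain : Map {
    std::vector<Map*> elements;
    Chain& operator<<(Map& m) {
        elements.push_back(&m);
        return *this;
    }
    void doMotion() override {
        for (Map* m : elements) m->doMotion();
    }
    void doForce() override {
        for (auto it = elements.rbegin(); it != elements.rend(); ++it) (*it)->doForce();
    }
};
```

With two gears of ratio 2 and 3 appended to a chain, an input velocity of 0.5
becomes an output velocity of 3, and an output force of 10 is reflected to the
input as 60, so the power at input and output is the same (30 in both cases).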

Inputs and outputs of transmission elements can be scalar or spatial
quantities. They are regarded in MOBILE as state objects. Spatial motion is stored
in frames, while scalar quantities are stored in objects termed scalar variables.
Each of these objects embraces the complete information regarding position,
velocity, acceleration, and load. In MOBILE, a frame is represented by an
individual object of class MoFrame with members R, r, ang_v, lin_v, ang_a, lin_a,
t, f, denoting the rotation matrix, the radius vector, the angular and linear
velocity vectors, the angular and linear acceleration vectors, the torque and the
force, respectively. As a convention, all vectors are assumed to be decomposed
in the moving frames. Scalar state objects in MOBILE belong to the classes
MoLinearVariable and MoAngularVariable for linear and rotational variables,
respectively. Both classes have the members q, qd, qdd, Q denoting position,
velocity, acceleration, and generalized force of the variable. However, for linear
variables q is of type MoReal, whereas for angular variables q is of type MoAngle.
The difference between a MoAngle and a MoReal is that the former stores its sine
and cosine together with the value of the variable, so that these do not have to
be repeatedly calculated.
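
The caching idea behind such an angle type can be sketched as follows. The
class name CachedAngle and its interface are hypothetical and only illustrate
the principle; the actual MoAngle interface is not shown in this text.

```cpp
#include <cmath>

// Hypothetical angle type that caches sine and cosine alongside the value.
class CachedAngle {
public:
    explicit CachedAngle(double value = 0.0) { set(value); }

    // Update the angle and refresh the cached trigonometric values once.
    void set(double value) {
        value_ = value;
        sin_ = std::sin(value);
        cos_ = std::cos(value);
    }

    double value() const { return value_; }
    double sine() const { return sin_; }      // no recomputation on access
    double cosine() const { return cos_; }

private:
    double value_ = 0.0, sin_ = 0.0, cos_ = 1.0;
};
```

Every transmission element that reads the same angle then reuses the stored
sine and cosine instead of evaluating them again.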

State objects act as interface elements between which the kinetostatic trans- 
mission elements carry out their mappings: objects of the class MoFrame represent 
the junctions by which the mechanical components are connected together, while 
scalar state objects represent the actuator variables which are used to drive the 
joints or to move the bodies along a prescribed trajectory. 

As an example, Fig. 2 shows a simple manipulator consisting of two revolute
joints and its corresponding MOBILE model. Note that this is the entire
source code of an executable program. Note also that there is a one-to-one
correspondence between the physical components and their program counterparts,
and that at the end of the modeling, it is possible to treat the composite
system as a simple transmission element by invoking the doMotion function. The
places where the values of the state objects are prescribed and those where the
transmission functions are invoked may be far apart in the code. For example,
one might set the kinematical inputs within a module modeling the behavior of
a control unit, while the doMotion function is invoked in a module where a
sensor measuring the approach velocity of another object is modeled. This
technique, also termed “modeling by programming” [1], leads to lean programming
models that can be incorporated easily and efficiently into other environments.

    #include <Mobile/MoElementaryJoint.h>
    #include <Mobile/MoRigidLink.h>
    #include <Mobile/MoMapChain.h>

    int main()
    {
        // reference frames and actuator variables
        MoFrame K0, K1, K2, K3, K4;
        MoAngularVariable beta1, beta2;

        // transmission elements
        MoVector l1(0,4,0), l2(0,3,0);
        MoRigidLink L1(K1, K2, l1);
        MoRigidLink L2(K3, K4, l2);
        MoElementaryJoint R1(K0, K1, beta1, z_axis);
        MoElementaryJoint R2(K2, K3, beta2, x_axis);

        // complete system
        MoMapChain Manipulator;
        Manipulator << R1 << L1 << R2 << L2;

        // definition of actuator positions
        beta1.q = 0.25 * MO_PI;
        beta2.q = -0.35 * MO_PI;

        // transmission of motion
        Manipulator.doMotion(DO_ALL);
        return 0;
    }

a) system model

b) executable program

Fig. 2. Example: Programming of a simple manipulator

For the generation of the equations of motion, a coordinate-independent
approach is employed in which the relevant terms of the equations of motion are
reproduced from pure motion and force traversals of the system. The equations
of motion of minimal order are denoted by

\[ M(q; t)\,\ddot q + b(q, \dot q; t) = Q(q, \dot q; t), \tag{4} \]

where $q = [q_1, \ldots, q_f]^T$ are the generalized coordinates, $M$ is the $f \times f$ generalized
mass matrix, and $b$ and $Q$ are the vectors of generalized Coriolis and
centrifugal forces and of generalized applied forces, respectively. Mass-inertia
properties are taken into account by computing the corresponding d'Alembert
forces and motions through corresponding elements termed MoMassElement in MOBILE.
Applied forces are taken into account through a further set of kinetostatic
transmission elements termed MoForceElement, which evaluate the applied forces
from the position and velocity state of particular objects and apply the
resulting forces back to these state objects. As MoMassElements
and MoForceElements do not further transmit motion, the overall structure of
a multibody system takes the form depicted in Fig. 3.
Fig. 3. Model of the inverse dynamics of a multibody system 



If one regards the whole system, consisting of a kinematical subsystem and
the attached mass and force elements, as one transmission element called global
kinematics, one can prescribe the motion of the generalized coordinates $q$, their
first and second order time derivatives as well as the applied forces, collected in
a vector $W^{(e)}$, and perform the composition of the motion and force transmission
function as a function $\varphi_S^{D^{-1}}$ which exactly implements the inverse dynamics
($D^{-1}$) of the system, i.e. the computation of the residual forces

\[ \bar Q = \varphi_S^{D^{-1}}(q, \dot q, \ddot q; W^{(e)}; t) = M(q; t)\,\ddot q - \hat Q(q, \dot q; W^{(e)}; t) , \tag{5} \]

as a function of $q$, $\dot q$, $\ddot q$, and $W^{(e)}$. Comparing with (4), $\hat Q$ collects the
applied forces minus the Coriolis and centrifugal forces. With this function, it
is easy to obtain the unknown quantities $\hat Q$ and $M$ of the equations of motion:
for $\hat Q$, one just evaluates $\varphi_S^{D^{-1}}$ after setting $\ddot q = 0$, and for $M$, one first
eliminates $\hat Q$ from the calculations (by turning off the evaluation of gyroscopic
and applied forces and of rheonomic terms) and then calculates $\varphi_S^{D^{-1}}$ for one
acceleration equal to one, e.g. $\ddot q_\nu = 1$, while all others are equal to zero,
giving the $\nu$-th column of $M$. Thus, it is possible to obtain the complete
dynamics of a system just by using the kinetostatic transmission elements.
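
The column-by-column recovery of the mass matrix can be sketched in a few lines.
This is a minimal illustration, not MOBILE code: the names Residual,
cartPendulumResidual and extractDynamics are hypothetical, the residual is
assumed to follow the sign convention M*qdd + b - Q, and the cart with pendulum
and its parameter values are an assumed toy system. Instead of switching off
the velocity-dependent and applied forces, the sketch subtracts the residual
obtained for zero accelerations, which has the same effect because the residual
is linear in the accelerations.

```cpp
#include <array>
#include <cmath>
#include <functional>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<Vec2, 2>;  // M[row][column]
using Residual = std::function<Vec2(const Vec2&, const Vec2&, const Vec2&)>;

// Assumed toy system: cart (q[0]) with a pendulum (q[1]), no applied force.
// Returns the residual M*qdd + b - Q for prescribed q, qd, qdd.
Vec2 cartPendulumResidual(const Vec2& q, const Vec2& qd, const Vec2& qdd) {
    const double mc = 2.0, mp = 1.0, l = 0.5, g = 9.81;  // assumed parameters
    const double s = std::sin(q[1]), c = std::cos(q[1]);
    Vec2 r;
    r[0] = (mc + mp) * qdd[0] + mp * l * c * qdd[1] - mp * l * s * qd[1] * qd[1];
    r[1] = mp * l * c * qdd[0] + mp * l * l * qdd[1] + mp * g * l * s;
    return r;
}

struct Dynamics {
    Mat2 M;     // generalized mass matrix
    Vec2 rest;  // residual at zero acceleration, i.e. b - Q
};

// Recover M and b - Q from evaluations of the inverse dynamics alone.
Dynamics extractDynamics(const Residual& residual, const Vec2& q, const Vec2& qd) {
    Dynamics d;
    const Vec2 zero{{0.0, 0.0}};
    d.rest = residual(q, qd, zero);
    for (int nu = 0; nu < 2; ++nu) {
        Vec2 unit = zero;
        unit[nu] = 1.0;  // one acceleration equal to one, all others zero
        const Vec2 col = residual(q, qd, unit);
        for (int i = 0; i < 2; ++i) d.M[i][nu] = col[i] - d.rest[i];
    }
    return d;
}
```

For f generalized coordinates this needs f + 1 evaluations of the inverse
dynamics per state, and no symbolic expression for M is ever formed.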

3 Integration of Interval Arithmetic in MOBILE 

In this section we first give a brief introduction to interval arithmetic, which is
provided by the basic scalar data type MoInterval in MOBILE. It is possible
to achieve guaranteed simulation results with this data type. Uncertain inputs,
which, for example, arise from sensor measurements, can also be modeled.
Moreover, the use of interval arithmetic increases the number of modeling options, for
instance, by introducing a revolute joint with slackness. Hence, the interval
extension makes MOBILE more powerful.

To obtain not only reliable but also useful results, we have to tackle typical
problems of interval arithmetic such as the wrapping effect. We show how to
cope with the wrapping effect in MOBILE and modify kinetostatic transmission
elements for this purpose.

3.1 Basics of Interval Arithmetic

Floating-point arithmetic can lead to erroneous results even if computations are 
made according to the IEEE 754 standard [14]. A remarkable example is given in 
[15]. To obtain a verified enclosure for the exact result, interval arithmetic can be 
used, in which all calculations are performed with two floating-point numbers, 
a lower and an upper bound for each of the inputs and results. In conjunction 
with directed roundings, we can obtain an enclosure of the exact result on a 
computer. 

Another advantage of intervals is the handling of uncertain data, e.g. data 
resulting from measurements, which can also be represented as intervals. 

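Now we can give a brief survey of interval arithmetic. Consider a real interval
$X = [\underline{x}, \overline{x}] := \{x \in \mathbb{R} \mid \underline{x} \le x \le \overline{x}\}$. For any arithmetic operation $\circ \in \{+, -, \cdot, /\}$
and intervals $X$, $Y$ (with $0 \notin Y$ in the case of division), we can define the
corresponding interval arithmetic operations:

\[
\begin{aligned}
X \circ Y &:= \{x \circ y \mid x \in X,\ y \in Y\} \\
&= \{x \circ y \mid \underline{x} \le x \le \overline{x},\ \underline{y} \le y \le \overline{y}\} \\
&= \{z \mid \min(\underline{x} \circ \underline{y}, \underline{x} \circ \overline{y}, \overline{x} \circ \underline{y}, \overline{x} \circ \overline{y}) \le z \le \max(\underline{x} \circ \underline{y}, \underline{x} \circ \overline{y}, \overline{x} \circ \underline{y}, \overline{x} \circ \overline{y})\} .
\end{aligned}
\]

With $\underline{z} := \min(\underline{x} \circ \underline{y}, \underline{x} \circ \overline{y}, \overline{x} \circ \underline{y}, \overline{x} \circ \overline{y})$ and $\overline{z} := \max(\underline{x} \circ \underline{y}, \underline{x} \circ \overline{y}, \overline{x} \circ \underline{y}, \overline{x} \circ \overline{y})$, we
obtain

\[ Z := [\underline{z}, \overline{z}] = X \circ Y . \]

For standard functions $f$, e.g. sin, cos, exp, ln, there are algorithms to
compute an accurate interval containing the value set $V_f(X) := \{f(x) \mid x \in X\}$ of
$f$, where $X$ is an interval [16].

Thus, we are able to compute an enclosure of the value set $V_f(X)$ for any
function $f$ which is composed of fundamental operations and standard functions.
We replace the variable $x$ by an interval $X$ with $x \in X$ and compute $f(X)$ using
appropriate interval operations. Then,

\[ f(x) \in V_f(X) \subseteq f(X) \quad \text{for all } x \in X . \]

Note that $f(X)$ depends on the representation of $f$, i.e. it is possible that
$f(X) \ne g(X)$ for $f = g$ [17].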
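
The dependence on the representation can be seen in a minimal sketch. The
Interval type below is a hypothetical stand-in, not the Profil/Bias or MOBILE
type, and it deliberately omits directed rounding, so it illustrates the
min/max rule only and gives no guarantee on a real machine.

```cpp
#include <algorithm>

// Minimal interval type; bounds are NOT rounded outward in this sketch.
struct Interval {
    double lo, hi;
};

Interval operator+(Interval a, Interval b) { return {a.lo + b.lo, a.hi + b.hi}; }
Interval operator-(Interval a, Interval b) { return {a.lo - b.hi, a.hi - b.lo}; }
Interval operator*(Interval a, Interval b) {
    // Minimum and maximum over the four endpoint products.
    const double p1 = a.lo * b.lo, p2 = a.lo * b.hi, p3 = a.hi * b.lo, p4 = a.hi * b.hi;
    return {std::min({p1, p2, p3, p4}), std::max({p1, p2, p3, p4})};
}

// Two representations of the same real function x(1 - x) = x - x^2.
Interval f(Interval x) { return x * (Interval{1.0, 1.0} - x); }
Interval g(Interval x) { return x - x * x; }
```

For X = [0, 1] the first form yields [0, 1] and the second [-1, 1]. Both
enclose the value set [0, 0.25], but with different amounts of overestimation.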

We only have a finite number of machine numbers for calculations on a
computer. A machine interval is represented by two machine numbers. For an
interval $[\underline{x}, \overline{x}]$, we round $\underline{x}$ down to the largest machine number less than or
equal to $\underline{x}$, and $\overline{x}$ up to the smallest machine number greater than or equal to $\overline{x}$.

3.2 The New Data Type MoInterval

The basic floating-point data type for real scalars in MOBILE is MoReal.
Currently, it is the same as double in C++. A new data type has been introduced
into MOBILE for interval calculations: the class MoInterval. Besides a variable
of the type INTERVAL from the interval arithmetic package Profil/Bias [3],
MoInterval also provides a double variable for bounding the absolute
computation error. This bound is computed automatically as described in [18].

A variable of the type MoInterval can be used in the same way as a MoReal
variable because the arithmetic operations and the standard functions are
overloaded. Besides, an interval given by its lower and upper bound can be assigned
to a MoInterval variable.

In each case where a floating-point number is used to construct a MoInterval
value, it is guaranteed that the (decimal) number represented by the
corresponding string in the MOBILE program is completely enclosed by the interval value of
MoInterval. For example, MoInterval(0.1) and MoInterval(0.1, 0.1) both
result in the same interval enclosing exactly three consecutive floating-point
numbers a < b < c, where b is the floating-point representation of the decimal
value 0.1. Of course, this is an overestimation, because there is an interval
enclosing 0.1 containing only two consecutive floating-point numbers for lower and
upper bound. This can be achieved by using the constructor with string input:
MoInterval("0.1").
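
One plausible way to obtain such a three-number enclosure is to step one
machine number outward on each side of the converted literal. The sketch below
only illustrates that idea with the standard function std::nextafter; the
helper encloseLiteral and the Interval struct are hypothetical, and the text
does not say that MoInterval is implemented this way.

```cpp
#include <cmath>
#include <limits>

struct Interval {
    double lo, hi;
};

// Hypothetical helper: widen a double by one machine number on each side, so
// that the decimal value the literal stood for is enclosed even though its
// conversion to binary was rounded to nearest.
Interval encloseLiteral(double x) {
    const double inf = std::numeric_limits<double>::infinity();
    return {std::nextafter(x, -inf), std::nextafter(x, inf)};
}
```

The result for 0.1 contains the predecessor, the double nearest to 0.1, and the
successor. A string-based constructor can do better because it still knows on
which side of the nearest double the decimal value lies.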

The output of a MoInterval variable is printed in the following way: first an
enclosing interval is printed (because of the internal binary representation, the
lower bound may be rounded down and the upper bound may be rounded up
for output), then a maximum absolute error is printed in parentheses.

A list of constructors, member operators, and member functions can be found
in [19].



3.3 Extended MOBILE Objects 

The main goal of the new extension package is “easy handling” , which means 
that a user familiar with MOBILE does not have to learn a new language. 
Of course, there are new classes for mathematics and kinetostatic transmission 
elements, which provide interval arithmetic [19], but they can be used in the 
same way as the corresponding well-known MOBILE classes. Only the names 
are different. All names of MOBILE classes start with Mo, e.g. MoRigidLink,
and the names of the corresponding interval classes begin with MoInterval, e.g.
MoIntervalRigidLink.

Besides the names of classes, some of the names of constants have been
changed for interval purposes. For example, INTERVAL_PI replaces the
floating-point constant MO_PI and represents a MoInterval enclosing $\pi$.

The MOBILE program in Fig. 4 describes the model of the example from
section 2.1 and calculates the position of its tip. It is assumed here that the rigid
links have certain masses and that both elementary joints rotate about the same
x axis. This program is derived from the classical MOBILE one just by placing the
words Interval and INTERVAL in the correct positions.

    #include <Mobile/MoIntervalElementaryJoint.h>
    #include <Mobile/MoIntervalRigidLink.h>
    #include <Mobile/MoIntervalMassElement.h>
    #include <Mobile/MoIntervalMapChain.h>

    int main() {
        // reference frames, actuator variables, link vectors and masses
        MoIntervalFrame K0, K1, K2, K3, K4;
        MoIntervalAngularVariable beta1, beta2;
        MoIntervalVector l1, l2;
        MoInterval m1, m2;

        // transmission elements
        MoIntervalElementaryJoint R1(K0, K1, beta1, xAxis);
        MoIntervalRigidLink rod1(K1, K2, l1);
        MoIntervalElementaryJoint R2(K2, K3, beta2, xAxis);
        MoIntervalRigidLink rod2(K3, K4, l2);
        MoIntervalMassElement Tip1(K2, m1);
        MoIntervalMassElement Tip2(K4, m2);

        // complete system
        MoIntervalMapChain Manipulator;
        Manipulator << R1 << rod1 << Tip1 << R2 << rod2 << Tip2;

        // parameter values and actuator positions
        l1 = MoIntervalVector(0, 4, 0);
        l2 = MoIntervalVector(0, 3, 0);
        m1 = 0.1;
        m2 = 0.1;
        beta1.q = 0.25 * INTERVAL_PI;
        beta2.q = -0.35 * INTERVAL_PI;

        // transmission of motion and output of the tip position
        Manipulator.doMotion(DO_INTERVAL_ALL);
        cout << "Position = " << K4.R * K4.r << endl;
    }

Fig. 4. The interval program for the manipulator (kinematics)

The resulting output of the classical MOBILE program is

    Position = (0,5.6816,1.90138),

whereas the output of the interval version reads as follows:

    Position = ([0,0] (0.0000000000000000E+0),
                5.6815966736316[161,819] (1.9739362421216854E-14),
                1.9013761416213[121,817] (1.8214953086484989E-14)).

The three intervals contain the exact solutions for the three components of the
position vector. The value in parentheses after an interval indicates the
corresponding maximum absolute computation error.

3.4 A Sloppy Joint 

Interval arithmetic is not only a means of computing verified results or handling 
uncertain data, it also enhances the modeling opportunities of MOBILE. 

In [20], a sloppy revolute joint is modeled to cope with real world revolute 
joints. The rotation axes of two connected bodies are no longer assumed to 
be exactly concentric, but the relative distance between these axes is within a 
specific (small) range. 

Fig. 5 shows the CAD drawings of a sloppy joint. The corresponding model 
with the inserted sloppy link is displayed in Fig. 6. 

The parameter $\varphi_i$ that describes the relative orientation between two
connected bodies is the same for the sloppy joint and for an ideal one. Two more
parameters are added for the sloppy joint to describe the unique positions of the
two bodies.

Assume that the two bodies are not coupled directly, but connected by one
additional rigid link, called a sloppy link $l_i$. In MOBILE this sloppy link is
modeled as a regular rigid link that connects the two bodies. This connection
is achieved by using two regular revolute joints. The two additional parameters
are the length $l_i$ of the link and the relative orientation angle $\alpha_i$. The length $l_i$
can be chosen in the interval $[0, l_i^{\max}]$ and the orientation angle can be any angle
$\alpha_i \in [0, 2\pi[$. These parameters are of the MoInterval data type.

Fig. 5. CAD model of a sloppy joint




\[ F_{i+1} = F_i , \qquad M_{i+1} = M_i + l_i \times F_i \]



Fig. 6. Calculation of position, force, and torque in a sloppy joint 



An example of a manipulator built with sloppy joints and a comparison of 
results obtained by interval arithmetic and Monte Carlo simulation are presented 
in [20]. 



3.5 Avoiding the Wrapping Effect 

The change of data types in MOBILE from MoReal to MoInterval leads to
verified simulations but not always to useful results because of the dramatic influence
of the wrapping effect. This influence can be demonstrated with the example from
section 3.3 with sloppy joints instead of MoElementaryJoint objects (we replace
them with MyIntervalSlacknessJoint objects).

Consider a sloppiness of 0.02 in each joint, i.e. the maximum distance between
the reference points of the frames K0 and K1 is 0.02; the same holds for K2 and
K3. The task is to determine the position $p$ of the tip, i.e. the position of the
reference point of K4 with respect to K0.

The result computed in MOBILE with (naive) interval arithmetic is the
interval vector

\[ p = \begin{pmatrix} [0,\ 0] \\ [5.6224993009825814,\ 5.7406940462807193] \\ [1.8422787689722757,\ 1.9604735142704182] \end{pmatrix} . \]

Here we have an overestimation in the second and third components, where the
diameters of both intervals are about 48% larger than the exact ones. This is
due to the fact that the disc for the position of the reference point of K0 (in
the following we disregard the first dimension because the problem is planar)
is enclosed in a square interval which is then rotated by 45° and wrapped
with an interval again. The same happens when the rotation in the second joint
is made.
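
The growth caused by rotating and re-wrapping a box can be reproduced with a
few lines. The sketch below is a stand-alone illustration, not MOBILE code: the
struct Box and the function rotateAndWrap are hypothetical, rounding errors are
ignored, and only a single planar box is considered, so the numbers of the
manipulator example above are not reproduced.

```cpp
#include <cmath>

// Axis-aligned box in the plane, given by its center and half-widths.
struct Box {
    double cx, cy, rx, ry;
};

// Rotate the box about the origin and enclose the result in a new
// axis-aligned box (no outward rounding in this sketch).
Box rotateAndWrap(const Box& b, double angle) {
    const double c = std::cos(angle), s = std::sin(angle);
    Box r;
    r.cx = c * b.cx - s * b.cy;
    r.cy = s * b.cx + c * b.cy;
    r.rx = std::fabs(c) * b.rx + std::fabs(s) * b.ry;
    r.ry = std::fabs(s) * b.rx + std::fabs(c) * b.ry;
    return r;
}
```

A square with half-width 0.02 grows by a factor of sqrt(2) under one rotation
by 45° and to half-width 0.04 after two such rotations, whereas a disc of
radius 0.02 would keep its size under any rotation.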

This overestimation can be avoided in several ways. Since we started with
a disc, a rotation should not lead to a larger disc except for rounding error
accumulation. Hence, we could use midpoint-radius calculations. However, as
shown in [20], the shape of the set of possible positions of an end-effector of a
manipulator with several uncertain inputs can still be quite unpredictable.

Consider the mapping from the position $p_0$ of the reference point of frame K0 to
the position $p_2$ of the reference point of frame K2,

\[ p_2 = p_0 + \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\tau & -\sin\tau \\ 0 & \sin\tau & \cos\tau \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 4 \\ 0 \end{pmatrix} \]

with $\tau = \frac{1}{4}\pi$. If $p_0$ is an interval vector, it is translated without any rotation
or deformation. The same holds for the mapping $p_2 \mapsto p$. This leads to another
approach to avoiding the wrapping effect.

Unfortunately, a redesign of MOBILE classes is necessary: for example, of the
class MoFrame, which stores the kinematic and static state of an “oriented” point
in space, and of transmission elements like MoRigidLink or MoElementaryJoint,
which map this information to other MoFrame objects. If we additionally store the
input error of the position as an interval, we need to do this in global coordinates.
In the future, the described approach will also be used to avoid the wrapping effect
in the velocity and acceleration.

For the above example with the modified transmission elements, the resulting
interval vector is then

\[ \begin{pmatrix} [0,\ 0] \\ [5.6415966736316107,\ 5.7215966736316970] \\ [1.8613761416213070,\ 1.9413761416213929] \end{pmatrix} , \]

which is a tighter enclosure with respect to rectangular intervals.

4 Interval Modeling of Dynamics

This section describes an approach to validating dynamics in MOBILE and 
presents an IIVPS for “numerically” modeled mechanical systems. The perfor- 
mance of the solver is demonstrated in some examples. 



4.1 An Approach to Interval Modeling of Dynamics in MOBILE 

Before speaking about interval modeling of dynamics in MOBILE, let us digress 
for a moment and consider multibody modeling software in general. It can be 
roughly divided into two types, which are nominally denoted here as “symbolic ” 
and “numerical”. They differ in how they represent the model of a given me- 
chanical system. The former produces an explicit symbolic description of the 
resulting differential equations. For instance, if a model for the harmonic
oscillator is sought, with parameters $m$ and $k$, this type of software delivers the
equation $m\ddot{x}(t) + kx(t) = 0$. A drawback of this kind of software is its slowness;
besides, the explicit representation is so intricate in most cases that it has to be
manually simplified.

Packages of the “numerical” type dispense with the symbolic description 
of the model for the sake of higher computational speed. They provide values 
of the specified model parameters at some arbitrary points (hence the name 
numerical) . These values are “exact” in the same manner as the values obtained 
by substitution of given points into the symbolic description are “exact” . The 
only difference is that the general expression remains unknown in the former 
case. 

It seems important here to point out two different meanings of the word 
“numerically”. One meaning expresses the fact that no explicit “symbolic” de- 
scription of the resulting entities is produced. Another indicates the presence 
of a certain approximation error and, consequently, the loss of accuracy. In our 
context, this meaning of the word “numerically” can be contrasted with that of 
the word “symbolically” , which stands for differentiation without approximation 
errors with the help of the usual automatic differentiation techniques (e.g. [21]). 
These two meanings should not be confused: if we speak here about “numerical 
software”, we mean it in the first sense. 

The problem which arises when building an IIVPS into MOBILE is the
seeming impossibility of “exact” or “symbolic” differentiation in the case of
“numerical” software. MOBILE belongs to the “numerical” type, but to model
dynamics with interval methods, one has to compute many derivatives of a
system function, which are not provided by the program.

There are a few options to handle this problem. The use of divided differences
or other methods of numerical differentiation is not admissible because it contradicts
the general idea of verification, which is to guarantee that the solution lies within
a certain interval. Nothing can be guaranteed if truncation errors in derivatives
are not taken into account.

Therefore, the next option is to obtain derivatives by considering the physics
of the system. There exist some theories that deal with finding Jacobians and
Hessians of a modeled function (e.g. [1]). It is probably possible to develop
similar theories for higher-order time derivatives and their Jacobians, but
not without a thorough understanding of the underlying subject matter. A serious
limitation of this approach lies in the necessity to reconsider the whole theory
to find a derivative of order $n + 1$ even if an algorithm for the $n$-th one is
known. Studies that deal with this kind of problem and attempt to automate
this process are unknown to the authors. In short, this approach might provide
the necessary derivatives, but research on that topic must constitute a separate
study, which would be best performed by a physicist.

On closer inspection of the problem, the third option, which we use in MOBILE,
becomes evident. Numerical differentiation might be the answer if all that is
known about a function is its values at some given points, but in our case the
algorithmic representation of this function in some programming language is
known as well. This piece of code can be treated as the “explicit” symbolic
representation of the function and helps to obtain its derivatives with the usual
techniques of automatic differentiation. This variety of “exact” differentiation,
called algorithmic differentiation [22], has been applied to MOBILE. By
summarizing the theoretical aspects of both algorithmic differentiation and IIVPS, the
following section explains which derivatives are necessary
to model dynamics with interval methods and how to retrieve them.



4.2 Theory Overview: IIVPS and Algorithmic Differentiation 

The idea of interval solution of initial value problems is not new. There have
been several studies in this area; see [23,24]. Most studies incorporate or develop
the methods presented by R. Lohner [21].

The task is formulated as follows. The set of autonomous initial value
problems

\[ \begin{cases} y'(t) = f(y) \\ y(t_0) \in [y_0] \end{cases} \tag{6} \]

is considered, where $t \in [t_0, t_n] \subset \mathbb{R}$ for some $t_n > t_0$, $f \in C^{p-1}(\mathcal{D})$, $\mathcal{D} \subseteq \mathbb{R}^m$
is open, $f : \mathcal{D} \to \mathbb{R}^m$, and $[y_0] \subseteq \mathcal{D}$. The problem is discretized on a grid
$t_0 < t_1 < \ldots < t_n$ with $h_{j-1} = t_j - t_{j-1}$. The solution of (6) with an initial
condition $y(t_{j-1}) = y_{j-1}$ is denoted by $y(t; t_{j-1}, y_{j-1})$, and the set of
solutions $\{y(t; t_{j-1}, y_{j-1}) \mid y_{j-1} \in [y_{j-1}]\}$ by $y(t; t_{j-1}, [y_{j-1}])$. The goal is to
compute interval vectors $[y_j]$, $j = 1, \ldots, n$, that are guaranteed to contain the
solution of (6) at $t_1, \ldots, t_n$. That is, $y(t_j; t_0, [y_0]) \subseteq [y_j]$, $j = 1, \ldots, n$.

The $j$-th step of most validated methods consists of two stages [24]:

1. Proof of existence and uniqueness. Compute a stepsize $h_{j-1}$ and an a priori
enclosure $[\tilde y_{j-1}]$ of the solution such that $y(t; t_{j-1}, y_{j-1})$ is guaranteed to
exist for all $t \in [t_{j-1}, t_j]$ and all $y_{j-1} \in [y_{j-1}]$, and $y(t; t_{j-1}, [y_{j-1}]) \subseteq [\tilde y_{j-1}]$
for all $t \in [t_{j-1}, t_j]$.

2. Computation of the solution. Compute a tight enclosure $[y_j] \subseteq [\tilde y_{j-1}]$ such
that $y(t_j; t_0, [y_0]) \subseteq [y_j]$. There are several algorithms to solve this problem,
which follow roughly the same scheme.

2.1. Choice of the underlying method. One can choose a one-step method

\[ y(t; t_j, y_j) = y(t; t_{j-1}, y_{j-1}) + h_{j-1}\,\varphi\bigl(y(t; t_{j-1}, y_{j-1})\bigr) + z_j , \]

where $\varphi$ is some method function, and $z_j$ is the local error. The usual choice
for $\varphi$ is Taylor series expansion.

2.2. Enclosure of the local error. Find an enclosure for the local error $z_j$. In the
case of the Taylor series expansion of order $p - 1$, this enclosure is obtained as
$[z_j] = h_{j-1}^{p} f^{[p]}([\tilde y_{j-1}])$, where $f^{[p]}([\tilde y_{j-1}])$ is an enclosure of the $p$-th Taylor
coefficient over $[\tilde y_{j-1}]$.

2.3. Enclosure of the global error. Compute a tight enclosure of the solution. In
the case of the Taylor series expansion of order $p - 1$, the resulting formula is

\[ [y_j] = \underbrace{\hat y_{j-1} + \sum_{i=1}^{p-1} h_{j-1}^{i} f^{[i]}(\hat y_{j-1})}_{\text{approximate solution}} + [z_j] + \underbrace{\Bigl(I + \sum_{i=1}^{p-1} h_{j-1}^{i} J\bigl(f^{[i]}; [y_{j-1}]\bigr)\Bigr)\bigl([y_{j-1}] - \hat y_{j-1}\bigr)}_{\text{global error}} , \tag{7} \]

where $\hat y_{j-1} \in [y_{j-1}]$ and $J(f^{[i]}; [y_{j-1}])$ is the Jacobian of $f^{[i]}$ evaluated at $[y_{j-1}]$.
These Jacobians are equal to the Taylor coefficients for the solution of the
associated variational equation

\[ Y' = \frac{\partial f}{\partial y}\, Y , \qquad Y(t_{j-1}) = I , \tag{8} \]

which leads to an algorithm for their computation [21]. This is the so-called
direct Taylor series method for the global error propagation, which in most
cases overestimates the enclosure due to the wrapping effect [21]. Some
methods to reduce this effect use non-orthogonal (“parallelepiped”) and
orthogonal (“QR factorization”) coordinate transformations, ellipsoids, zonotopes,
and Taylor models [25].
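
The two stages can be traced on a scalar test problem. The sketch below makes
several assumptions that are not part of the text: the problem is y' = -y, the
order is fixed to p = 3, the a priori enclosure is found by a simple inflation
loop around the Picard operator, and the hypothetical Interval type uses no
directed rounding, so the result is illustrative rather than verified. For
this right-hand side the Taylor coefficients are (-1)^i y / i!, and their
Jacobians are the constants (-1)^i / i!.

```cpp
#include <algorithm>

// Minimal interval type; bounds are NOT rounded outward in this sketch.
struct Interval {
    double lo, hi;
};

Interval operator+(Interval a, Interval b) { return {a.lo + b.lo, a.hi + b.hi}; }
Interval operator*(Interval a, Interval b) {
    const double p1 = a.lo * b.lo, p2 = a.lo * b.hi, p3 = a.hi * b.lo, p4 = a.hi * b.hi;
    return {std::min({p1, p2, p3, p4}), std::max({p1, p2, p3, p4})};
}
Interval scale(double c, Interval a) { return Interval{c, c} * a; }
bool subset(Interval a, Interval b) { return b.lo <= a.lo && a.hi <= b.hi; }

// Right-hand side of the assumed test problem y' = -y.
Interval f(Interval y) { return scale(-1.0, y); }

// Stage 1: look for [yt] with [y0] + [0,h]*f([yt]) contained in [yt].
bool aPrioriEnclosure(Interval y0, double h, Interval& yt) {
    yt = y0;
    for (int k = 0; k < 20; ++k) {
        const Interval picard = y0 + Interval{0.0, h} * f(yt);
        if (subset(picard, yt)) return true;
        const double eps = 0.1 * (picard.hi - picard.lo) + 1e-12;  // inflation
        yt = {std::min(picard.lo, yt.lo) - eps, std::max(picard.hi, yt.hi) + eps};
    }
    return false;
}

// Stage 2 with p = 3: approximate solution + local error + global error.
bool step(Interval y0, double h, Interval& y1) {
    Interval yt;
    if (!aPrioriEnclosure(y0, h, yt)) return false;
    const double mid = 0.5 * (y0.lo + y0.hi);             // point inside [y0]
    const double factor = 1.0 - h + 0.5 * h * h;          // 1 + sum of h^i J(f^[i])
    const Interval approx{factor * mid, factor * mid};    // Taylor polynomial at mid
    const Interval local = scale(-h * h * h / 6.0, yt);   // [z] = h^3 f^[3]([yt])
    const Interval global = scale(factor, Interval{y0.lo - mid, y0.hi - mid});
    y1 = approx + local + global;
    return true;
}
```

Starting from [0.9, 1.1] with h = 0.1, the step returns roughly
[0.8143, 0.9954], which contains the exact values 0.9 e^{-0.1} and 1.1 e^{-0.1}.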

This approach to validated integration requires the computation of Taylor
coefficients as well as of their Jacobians for the system function $f(y)$ ($f^{[i]}(\hat y_{j-1})$ and
$J(f^{[i]}; [y_{j-1}])$, respectively). These can be retrieved with the help of algorithmic
differentiation [23,22].

To be able to apply this form of differentiation we have to ensure that the 
source system meets some requirements: 

First, a set of elementary operations ($+$, $-$, $\cdot$, $/$ and elementary functions such
as sine or exponential) is specified. It is assumed that the right-hand side $f$ of the
system consists only of operations from this set.

The second assumption is that, though the explicit expression for $f(x)$
remains unknown, its algorithmic representation is given, that is, a step-by-step
specification of the evaluation of this function for given $x$ in terms of the
previously defined operations and functions. An example of such a specification is a
routine in some programming language. There are different ways to formalize this
concept, for instance, a computational graph. Here, it is a directed acyclic graph
in which nodes and edges represent the elementary operations and their
dependencies. This kind of formalization helps to develop data structures which
“record” the execution of the goal function and in this way make the implementation
of algorithmic differentiation possible.

Finally, elemental differentiability is assumed, that is, every elementary
operation is continuously differentiable up to some order $p$, $0 < p < \infty$. Under
this assumption, the chain rule can be applied to the mathematical formalization
of the algorithmic representation [23].

The result of this application is a system of linear equations, which can
be solved either by forward or by backward substitution. That is, by knowing
only the derivatives of the elementary operations, it is possible to obtain the
derivatives of an arbitrary function composed of them.

The two approaches mentioned above form the forward and backward modes 
of algorithmic differentiation respectively. The former was employed in MOBILE 
to get the Jacobian of the right side of a modeled initial value problem. 
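
A minimal sketch of the forward mode by operator overloading is given below. It
is not the FADBAD interface: the type Dual, the pendulum right-hand side and the
function names are hypothetical and only show how seeding one input derivative
at a time yields the Jacobian column by column.

```cpp
#include <array>
#include <cmath>

// Value together with one directional derivative (forward mode).
struct Dual {
    double val, der;
};

// Differentiation rules for the elementary operations used below.
Dual operator-(Dual a) { return {-a.val, -a.der}; }
Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }

using State = std::array<Dual, 2>;
using Jacobian = std::array<std::array<double, 2>, 2>;  // J[row][column]

// Assumed example: pendulum y = (theta, omega), f(y) = (omega, -sin(theta)).
State rhs(const State& y) {
    State f;
    f[0] = y[1];
    f[1] = -sin(y[0]);
    return f;
}

// One forward sweep per input variable yields one Jacobian column.
Jacobian jacobian(double theta, double omega) {
    Jacobian J{};
    for (int col = 0; col < 2; ++col) {
        State y;
        y[0] = Dual{theta, col == 0 ? 1.0 : 0.0};
        y[1] = Dual{omega, col == 1 ? 1.0 : 0.0};
        const State f = rhs(y);
        for (int row = 0; row < 2; ++row) J[row][col] = f[row].der;
    }
    return J;
}
```

The derivative values carry no truncation error, only the rounding errors of
the elementary operations, which is what distinguishes this technique from
divided differences.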

Also, Taylor coefficients up to the order p of a given function can be obtained 
by knowing the rules of their calculation for elementary operations, of which this 
function consists ( p here is the order defined by the assumption of elemental dif- 
ferentiability) [21]. Combining forward mode and Taylor coefficients’ algorithms, 
it is possible to find necessary Jacobians of the Taylor coefficients. 
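
How Taylor coefficients follow from rules for elementary operations can be shown
on a small example. The sketch assumes the scalar test problem y' = y * y, which
needs only the product rule for Taylor series (the Cauchy product); the function
names are hypothetical and plain doubles are used instead of intervals.

```cpp
#include <cstddef>
#include <vector>

// k-th Taylor coefficient of the product of two series (Cauchy product rule).
double productCoefficient(const std::vector<double>& a, const std::vector<double>& b,
                          std::size_t k) {
    double sum = 0.0;
    for (std::size_t i = 0; i <= k; ++i) sum += a[i] * b[k - i];
    return sum;
}

// Taylor coefficients y_0..y_order of the solution of the assumed test problem
// y' = y * y, y(0) = y0, using (k + 1) * y_{k+1} = (y * y)_k.
std::vector<double> taylorCoefficients(double y0, std::size_t order) {
    std::vector<double> y(order + 1, 0.0);
    y[0] = y0;
    for (std::size_t k = 0; k < order; ++k) {
        y[k + 1] = productCoefficient(y, y, k) / static_cast<double>(k + 1);
    }
    return y;
}
```

For y0 = 1 the exact solution is 1/(1 - t), and the recurrence indeed returns
coefficients that are all equal to one. Replacing double by a type that also
propagates derivatives gives the Jacobians of these coefficients.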

The next section specifies the software which implements the methods men- 
tioned above and its place in IIVPS. Besides, different IIVPS are compared from 
the point of view of their later integration into MOBILE. 



4.3 Available Software and Choice for Integration into MOBILE 

It appears more difficult to build a validated solver than a standard one if we 
think in terms of additional software involved. In addition to implementation of 
the actual algorithm, a validated solver must incorporate an interval arithmetic 
library and a package for automatic differentiation to compute interval Taylor 
coefficients for the solution of an ODE and of the associated variational equation 
(see section 4.2). 

As already mentioned, a package for automatic differentiation based on the
algorithmic representation of functions is required in the present case. It was
decided to use FADBAD [26] and TADIFF [27], both of which are built on
top of the interval library Profil/Bias. This library was also used to extend
kinematics in MOBILE to the interval case (section 3), so that the choice of
the algorithmic differentiation software was mostly predetermined by this fact
as well as by the programming language (C++ as in MOBILE).







As to the actual IIVPS algorithm, one can choose between several
packages: AWA [21] and its C++ version AWACOO [28], ADIODES [23], COSY
INFINITY [29], and VNODE [24]. But as experience showed, none of them can
be integrated into MOBILE as is. The choice between these packages has to
be made primarily according to their programming language and the availability
of an algorithmic differentiation option.

AWA and COSY INFINITY can be ruled out right away, because they 
are programmed in PASCAL-XSC and FORTRAN, respectively. All the oth- 
ers match MOBILE in language, but still they are not interchangeable. Though 
all of them solve systems given in their exact symbolic representation, which is 
not suitable for integration into MOBILE, VNODE and ADIODES do that in 
an “enhanced” way and use the packages FADBAD and TADIFF for differenti- 
ation. VNODE has some advantages in efficiency over ADIODES and is easier 
to use; besides, it introduces some new validated algorithms. For that reason it 
was decided to take this package as a basis of an interval solver in MOBILE. 

Hence, the process of integrating an IIVPS into MOBILE can be carried out 
as follows: first, enhancing MOBILE with algorithmic differentiation; 
second, adjusting VNODE to MOBILE; third, assembling the parts into the 
verifying extension. This approach offers the facility of automated calculation 
of derivatives in MOBILE directly, which helps to validate the parameters of a 
mechanical system and treat the uncertainty in input data. 



4.4 An Interval Extension of MOBILE for Modeling of Dynamics 

Fusion of VNODE and MOBILE. Before discussing the implementation of an 
interval solver in MOBILE, it is necessary to consider how such a solver is 
programmed in general, which gives some insight into the problems arising in 
the fusion of VNODE and MOBILE. To implement the theory outlined in section 4.2, 
the Taylor coefficients $f^{[i]}$ of the solution to the given system as well as 
those of the solution to the associated variational equation have to be 
generated. To achieve this, one transforms the algorithm for $f$ to get an 
algorithm for derivatives. One of the popular techniques for such a 
transformation is overloading: the definitions of the quantities and elementary 
operations involved in the algorithmic representation of $f$ are supplemented 
with corresponding differentiation rules. Therefore, to solve an initial value 
problem with the help of algorithmic differentiation implemented through 
overloading, three data types are required: 

1. One to get the actual values of the right hand side (for example, INTERVAL 
from Profil/Bias) 

2. One to get the Taylor coefficients of the solution (for example, TINTERVAL 
from TADIFF) 

3. One to get the Taylor coefficients of the solution to the associated variational 
equation (TFINTERVAL from FADBAD/TADIFF) 
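
The overloading technique described above can be illustrated with a minimal sketch. It is not the FADBAD/TADIFF implementation: the type Dual and the function f are hypothetical, and only first derivatives of a scalar function are propagated instead of Taylor coefficients of intervals. The point is that the algorithm for f is written once, while each elementary operation carries its own differentiation rule.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical first-order AD type: a value together with its derivative.
struct Dual {
    double val;  // function value
    double der;  // derivative with respect to the independent variable
};

// Each elementary operation is supplemented with its differentiation rule.
Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
Dual operator-(Dual a, Dual b) { return {a.val - b.val, a.der - b.der}; }
Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};  // product rule
}
Dual sin(Dual a) {
    return {std::sin(a.val), std::cos(a.val) * a.der};      // chain rule
}

// The algorithmic representation of f(x) = x * sin(x) - x, written once.
Dual f(Dual x) { return x * sin(x) - x; }
```

Calling f(Dual{2.0, 1.0}) seeds the independent variable with derivative 1 and returns both f(2) and f'(2) from a single evaluation of the same code.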



As a result, we are confronted with a “design problem”: on the one hand, 
a function to compute the right hand side is given; on the other hand, two 
further types of computational graphs are required, so two additional functions 
appear, which differ from the first one only in data types. This can inflate the 
code, especially in the case of MOBILE, where the goal function is “assembled” 
from subroutines of many transmission elements. 

VNODE solves this problem very elegantly with the help of templates. At 
the same time, this fact as well as some other usage features of the program 
(see [24]) prevents its “as is” integration into MOBILE. For example, one has 
either to know the exact expression for the right hand side or to turn all the 
subfunctions involved in the evaluation of this right hand side into templates. 
As already mentioned, the former alternative is not acceptable for MOBILE, 
which does not provide any explicit symbolic representations. The latter one 
also seems impracticable, because the body of subfunctions imposed by 
transmission elements is too large and intricate to be replaced by template 
analogs. Another such example is that a MOBILE user would have to work with 
VNODE (and not MOBILE itself) to model the dynamics of interval systems. These 
and other features of VNODE have to be adjusted to MOBILE. The next sections 
describe how this adjustment was put into practice. 
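
To make the template alternative concrete, here is a minimal sketch of a right hand side written once as a function template and instantiated with two data types. It is not VNODE code: Dual is a hypothetical stand-in for an AD type such as TINTERVAL, and pendulumRhs is an illustrative function. The sketch also shows why every subfunction called from the right hand side would have to become a template as well.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for an AD data type (value and first derivative).
struct Dual { double val, der; };
Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};
}
Dual operator-(Dual a) { return {-a.val, -a.der}; }
Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }

// Right hand side of theta'' = -(g/l) * sin(theta), written once.
// Any helper called from here must accept T as well, which is what makes
// the approach impractical for a large body of non-template subroutines.
template <typename T>
T pendulumRhs(const T& theta, const T& gOverL) {
    using std::sin;  // std::sin for double, ::sin(Dual) via ADL for Dual
    return -(gOverL * sin(theta));
}
```

pendulumRhs(0.5, 9.81) evaluates the function value only, whereas pendulumRhs(Dual{0.5, 1.0}, Dual{9.81, 0.0}) additionally yields the derivative with respect to theta.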



Enhancing MOBILE with Algorithmic Differentiation. As already men- 
tioned at the end of section 4.3, the process of integrating VNODE into MOBILE 
should start with making MOBILE capable of algorithmic differentiation to pro- 
vide for the derivatives required. This enhancement is carried out with the help 
of the packages FADBAD/TADIFF, which use operator overloading for algo- 
rithm transformation. Therefore, a new data type is required to manage all the 
computations automatically, with minimum effort on the user’s side. The use of 
algorithmic differentiation is legitimate in this case, because the set of elemen- 
tary operations in MOBILE (for the transmission elements we are considering 
at present) consists of addition, subtraction, multiplication, division, sine and 
cosine functions, which satisfy the assumption of differentiability introduced in 
section 4.2. We would like to point out that this holds for arbitrary p, because 
the branching (IF-statements) present in the transmission elements does 
not depend on the argument of the goal function and thus has no influence on 
the order of differentiability. 

The basic AD data type of the extension is called MoADInterval. Its struc- 
ture is shown in Fig. 7. With the help of this construction it is possible to obtain 
all computational graphs fast and through a single function call. On the other 
hand, the use of memory is inefficient, so this data type is in need of further 
optimization. For example, a hierarchy based on inheritance (MoADInterval from 
MoTInterval from MoInterval) would help to avoid unnecessary memory allocation 
for TFINTERVAL instances if only Taylor coefficients of the goal function 
(and not those of the variational equation) need to be computed. 
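
The inheritance hierarchy suggested above might look roughly as follows. This is only a sketch of the idea, not part of the extension: the stub types stand in for INTERVAL, TINTERVAL, and TFINTERVAL, and the member layout is an assumption.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for the Profil/Bias and FADBAD/TADIFF types.
struct IntervalStub { double lo, hi; };              // plays the role of INTERVAL
using TaylorStub = std::vector<IntervalStub>;        // plays the role of TINTERVAL
using TaylorJacobianStub = std::vector<TaylorStub>;  // plays the role of TFINTERVAL

// Level 1: enclosure of the function value only.
struct MoInterval { IntervalStub Enclosure; };

// Level 2: adds the Taylor coefficients of the solution.
struct MoTInterval : MoInterval { TaylorStub TEnclosure; };

// Level 3: adds the Taylor coefficients of the variational equation.
struct MoADInterval : MoTInterval { TaylorJacobianStub TFEnclosure; };

// Code that needs only the function value accepts any level of the hierarchy.
inline double midpoint(const MoInterval& x) {
    return 0.5 * (x.Enclosure.lo + x.Enclosure.hi);
}
```

With such a layout, a computation that needs only Taylor coefficients of the goal function could work with MoTInterval objects and would not allocate storage for the variational part.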

class MoADInterval { 
  // Data 
  INTERVAL Enclosure;       // the function value 
  TINTERVAL TEnclosure;     // the Taylor coefficients of the solution 
  TFINTERVAL TFEnclosure;   // those of the variational equation 
  // Methods 
}; 

Fig. 7. The basic AD data type 

To provide for algorithmic differentiation, all MOBILE structures have to be 
modified with the help of MoADInterval. This modification presupposes the 
systematic replacement of all variables of ordinary MOBILE data types with ones 
of the corresponding algorithmic differentiation types. At first, all instances 
of MoReal are replaced with MoADInterval, then those of MoAngle with 
MoADIntervalAngle, and so on. Also, some restructuring is required. It 
originates in the conversion from floating-point to interval computations rather 
than in algorithmic differentiation itself (e.g. the use of a verified linear 
systems solver instead of an ordinary one). 

At present, the following transmission elements are extended to work with 
the new data type: MoADIntervalMap, MoADIntervalElementaryJoint, 
MoADIntervalSphericalJoint, MoADIntervalMassElement, MoADIntervalRigidLink, 
and the part of MoADIntervalSpringDamper that does not contain square roots. 
The objects for pre-modeling of dynamics are transformed as well: 
MoADIntervalMapChain (provides the basis for the concatenation of elementary 
kinetostatic transmission elements into complex composite systems), 
MoADIntervalEqmBuilder (computes the equations of motion of a mechanical 
system), MoADIntervalDynamicSystem (provides a basic interface for obtaining 
the model equations in state-space form), and MoADIntervalMechanicalSystem 
(computes the equations of motion of general mechanical systems in state-space 
form). 

Most changes have to be made to the member function SolveLinearSystem 
of the class MoADIntervalEqmBuilder. This function is used to obtain 
accelerations from the known mass matrix and force vector. We cannot call the 
standard routine for solving linear systems provided by Profil/Bias, because its 
function Lss cannot be used with data types other than INTERVAL. However, both 
computational graphs are necessary for accelerations, because they constitute a 
part of an ordinary differential equations system in state-space form. Therefore, 
the basic validated algorithm [30] has to be reimplemented for the data type 
MoADInterval. The simple duplication of this algorithm, which is already slower 
for intervals, proves to be too expensive in practice: apart from taking more 
time than floating-point algorithms, it enlarges computational graphs too much, 
which considerably slows down the actual integration. Other solutions, which 
better suit algorithmic differentiation and accelerate further computations, 
require a thorough understanding of MOBILE's inner algorithms and are being 
developed at present. 

Some of the other methods employed in MOBILE's transmission elements 
can be optimized with respect to intervals as well (for example, to reduce 
overestimation). But at this stage of development the authors are more concerned 
with the implementation of the validated integrator itself than with 
overestimation due to one-to-one replacements of floating-point values with 
interval ones. In the case of a linear systems solver it is crucial to choose an 
appropriate method; in other cases this can be temporarily neglected. 
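
The overestimation caused by a one-to-one replacement of floating-point values with intervals can be seen in a small sketch. The Interval type below is hypothetical and does not control rounding, so it is not a validated implementation (Profil/Bias takes care of that in the actual extension); it only illustrates how two algebraically equivalent expressions yield enclosures of different width.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical minimal interval type WITHOUT directed rounding.
struct Interval { double lo, hi; };

Interval operator-(Interval a, Interval b) {
    return {a.lo - b.hi, a.hi - b.lo};
}
Interval operator*(Interval a, Interval b) {
    const double p[] = {a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi};
    return {*std::min_element(p, p + 4), *std::max_element(p, p + 4)};
}
double width(Interval a) { return a.hi - a.lo; }

// Two algebraically equivalent forms of x * (1 - x).
Interval naive(Interval x) { return x - x * x; }                        // x occurs three times
Interval factored(Interval x) { return x * (Interval{1.0, 1.0} - x); }  // x occurs twice
```

For x = [0, 1] the exact range of the function is [0, 0.25]. The naive form gives [-1, 1] and the factored form gives [0, 1]: both are valid enclosures, but the expression with fewer occurrences of x is tighter.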

With all mathematical objects, state objects, and transmission elements 
modified, one can start the actual implementation of an IIVPS for mechanical 
systems modeled by MOBILE. 



The Interval IVP Solver: MoADIntervalAWAIntegrator. To comply with 
the usual MOBILE interface for solving initial value problems, two of its classes 
(MoIntegrator, MoAdamsIntegrator) were modified. While the changes required by 
the former were of a rather formal character (mostly alterations in type names), 
the latter was to incorporate the modified VNODE and, consequently, had to 
be thoroughly transformed. 

The base class for all integrator objects in MOBILE as well as in its verifying 
extension is Mo(ADInterval)Integrator (Fig. 8 shows the latter version of its 
implementation). It provides the basic variables and routines that are 
inherent in all objects capable of solving modeled initial value problems for 
given mechanical systems. 



class MoADIntervalIntegrator : public MoADIntervalMap { 
protected: 
  MoADIntervalDynamicSystem* System;  // the system to be solved 
  int neq;                            // number of equations in the system 
  MoADInterval* Y;                    // the solution 
  MoADInterval* Yd;                   // its time derivative (the right side of the system) 
  MoReal Time;                        // the current integration point 
  MoReal tStart; MoReal tEnd;         // the starting and end point of integration 
  MoReal tInterval;                   // the stepsize 
public: 
  MoADIntervalIntegrator(); 
  MoADIntervalIntegrator(MoADIntervalDynamicSystem &sys_); 
  ~MoADIntervalIntegrator(); 
  virtual int getOrder(); 
  virtual void giveState(MoADInterval* st); 
}; 

Fig. 8. The base class for integrator objects 



This class is derived from the abstract class MoADIntervalMap. The “philosophy” 
of MOBILE presupposes that after setting the integration interval (tStart 
and tEnd), one can consider every integrator object as a transmission element 
(hence the derivation from MoADIntervalMap) that travels along the solution 
trajectory in small steps determined by tInterval. This way it is easier to 
communicate with the visualizing part of the software. In the interval extension 
of MOBILE, though, visualization is not as important as the problem of 
obtaining guaranteed enclosures. Besides, one of the special features of interval 
methods is the internal step size control, which either accelerates the 
integration of the system or provides the time interval over which the solution 
is proved to exist. Therefore, the interval extension of MOBILE does not allow 
for user-administered step size control at present. It implements its integrators 
as transmission elements which provide the solution at the end point, 
“documenting” their way to that state in a text file. On the other hand, a 
facility for interval visualization (whatever it may be) might become necessary 
in the future, so the variable tInterval is preserved in the base class. At 
present, it is equal to the difference between tEnd and tStart, but it is still 
possible to derive an object in which this stepsize is controlled from the 
outside. 

A class to handle the solution of initial value problems, which is derived from 
the class with the above properties, is implemented as shown in Fig. 9. 

class MoADIntervalAWAIntegrator : public MoADIntervalIntegrator { 
  // work space 
  MoADInterval* Tin; MoADInterval* Tout; 
  // characteristics of the integrator 
  int T_Ord;                        // the order of the Taylor series 
  IntegrationAlgorithm Algorithm;   // the name of the algorithm of solution 
  IntegrationFunction CompEncl;     // execution of the respective algorithm 
  // internal integrator functions 
  void dfn(...);                    // computational graphs of the right side of the system 
  void tGenerateTerms(...); ...     // Taylor coefficients of the solution 
  void vGenerateTerms(...); ...     // Taylor coefficients of the solution to the variational eq. 
  bool PredictStep(...); ...        // stepsize control 
  bool Validate(...);               // the proof of existence 
  void CompEnclITSDirect(...);      // the direct Taylor series method 
  void CompEnclITSQr(...);          // the QR-factorization method 
  // internal auxiliary functions 
public: 
  MoReal TOL;                       // tolerances 
  void doMotion(...);               // integration itself 
}; 

Fig. 9. The interval integrator class 



Apart from the data defined by MoADIntervalIntegrator, the object contains 
some arrays to store the necessary computational graphs (Tin, etc.), a relative 
tolerance variable, and some other auxiliary information. At present it 
incorporates two validated methods: the direct Taylor series method and the 
QR-factorization method. The constructor of MoADIntervalAWAIntegrator takes 
the name of the method (stored in Algorithm) and the order of the Taylor series 
expansion (T_Ord) as parameters and sets the function pointer CompEncl 
to either CompEnclITSDirect or CompEnclITSQr depending on which method 
has been chosen. Functions for the generation of the Taylor coefficients of the 
solution to the system (tGenerateTerms, etc.) and to the respective variational 
equation (vGenerateTerms, etc.) are now member functions of the class 
MoADIntervalAWAIntegrator. 
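
The selection of the enclosure method through a function pointer set in the constructor can be sketched as follows. SketchIntegrator and all of its members are hypothetical and heavily reduced; the real class works on MoADInterval arrays and its signatures are not given in the text.

```cpp
#include <cassert>
#include <string>

// Hypothetical names for the two validated methods mentioned in the text.
enum class IntegrationAlgorithm { ITSDirect, ITSQr };

class SketchIntegrator {
public:
    // The constructor fixes the Taylor order and binds the method once.
    SketchIntegrator(IntegrationAlgorithm alg, int order)
        : algorithm_(alg), order_(order),
          compEncl_(alg == IntegrationAlgorithm::ITSDirect
                        ? &SketchIntegrator::compEnclDirect
                        : &SketchIntegrator::compEnclQr) {}

    // Dispatches through the pointer chosen by the constructor.
    std::string doMotion() { return (this->*compEncl_)(); }
    int order() const { return order_; }
    IntegrationAlgorithm algorithm() const { return algorithm_; }

private:
    using EnclosureMethod = std::string (SketchIntegrator::*)();

    // Placeholders: they only report which method would run.
    std::string compEnclDirect() { return "direct Taylor series"; }
    std::string compEnclQr() { return "QR factorization"; }

    IntegrationAlgorithm algorithm_;
    int order_;
    EnclosureMethod compEncl_;
};
```

A single constructor call such as SketchIntegrator(IntegrationAlgorithm::ITSQr, 24) then determines which method every later doMotion call executes.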

The member function doMotion, which is inherent in all transmission elements, 
starts the process of integration and calls the function AWACDO. This 
function is the adaptation of VNODE's VODE_SOLVER::IntegrateTo. It chooses 
the stepsize control (PredictStep) and existence proof (Validate) strategies as 
well as the methods for tight enclosures (CompEncl) according to the parameters 
predefined by the constructor and executes the actual algorithm. 

A characteristic feature of VNODE's adjustment to MOBILE is the restructuring 
of the original object hierarchy of the former in subordination to that 
of the latter. This becomes necessary partly because of the decision to avoid 
templates, partly because of the complexity of that hierarchy, which results 
from the intention of VNODE's developers to provide maximal flexibility in the 
choice of solution strategies in order to compare them; this is not the primary 
goal of an interval solver for MOBILE. The changes come down to the following: 

- Merging INTERVAL, TINTERVAL, and TFINTERVAL into a single data type, 
which helps to avoid templates 

- Substitution of the branch in VNODE's object hierarchy responsible 
for computational graphs by additional work space arrays and respective 
methods in MoADIntervalAWAIntegrator 

- Absence of separate object branches for stepsize and order control strategies 
as well as for the existence proof algorithm 

- Identification of VNODE's solver classes with the MOBILE class 
MoADIntervalAWAIntegrator 

One can still choose the order of the Taylor series, the integration method, 
etc. But unlike VNODE's assembly of a particular integrator object from 
many instances of other classes, this choice is performed through a single call of 
the MoADIntervalAWAIntegrator constructor. 

MoADIntervalAWAIntegrator allows us to obtain validated enclosures of dy- 
namic parameters for general mechanical systems modeled in MOBILE. The 
next section describes the usage and performance of this class. 

Basic Usage of MoADIntervalAWAIntegrator and its Performance. With 
the help of the modified transmission elements, the feature of easy handling can 
be introduced for the modeling of dynamics in the same manner as for kinematics 
in section 3. The main difference from a normal MOBILE program is the 
attachment of the identifier ADInterval to the usual data type names. Some 
constants are also turned into interval ones. Then, the use of the validated 
initial value problem solver MoADIntervalAWAIntegrator provides for the dynamics 
of the system. In the interval case, two additional parameters (in comparison 
with the usual mode) are used by the constructor to determine what order of the 
Taylor series and what validated method are to be applied to solve the initial 
value problem. 

Let us consider the dynamics of a simple pendulum with a damper and compare 
our results with those obtained with the Adams-Moulton-Bashforth (AMB) 
integration algorithm, which are represented in Fig. 10. The pendulum starts 
moving at a twenty degree angle with zero velocity and is considered over the 
time interval [0; 10]. The tests were performed on a Pentium IV (CPU 2.26 GHz, 
RAM 256 MB). The plot on the left shows the dependence of the position and 
velocity of the pendulum on time, both for the validated and the numerical 
integrator. At this scale one can discern neither the differences between the 
upper and lower bounds of the solution set, nor its difference from the 
trajectory obtained with the AMB algorithm. The plot on the right demonstrates 
that these differences exist. It represents the position in relation to the 
midpoints of the intervals obtained with the QR-factorization method. At the 
scale of $10^{-11}$, it is evident that the AMB curve lies outside of the 
validated boundaries in that section of the diagram where the solution 
oscillates, but towards the end of the integration interval, where the solution 
stabilizes, it lies within or near them. The explanation of this effect may be 
the error of the AMB algorithm, which is considerable for oscillating solutions. 
The computing time in this case is about ten seconds. 





Fig. 10. Comparison of the validated and Adams-Moulton-Bashforth solution for a 
simple pendulum with a damper: trajectories (left) and their close-ups (right) 



The example above is one of the simplest. In more complicated cases, however, 
the extension does not work that fast. For instance, it takes eleven hours to 
model a triple pendulum. This system has three independent variables (angles of 
the three arms), so its state-space model has six unknowns, which is few in terms 
of physical systems. However, this model is relatively complicated from the point 
of view of the transmission elements involved. The more transmission elements 
are used, the larger the computational graphs are and the longer it takes to 
traverse them. 

A significant loss of time is caused by the member function SolveLinearSystem 
of the class MoADIntervalEqmBuilder. It was not necessary to solve a 
system of linear equations in the first example. The mass and the force being 
scalar, we simply divided the latter by the former to obtain the sought second 
equation of the state-space representation. In the case of the triple pendulum it 
becomes necessary, because the dimension is higher and the quantities in question 
are no longer scalar. We have to use validated methods to solve the respective 
system of linear equations, which is very time consuming for the reasons mentioned 
on page 150. 
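
For the scalar case, the division of an interval force by an interval mass can be sketched as below. The Interval type and the function divide are hypothetical, and the outward widening by one floating-point number via std::nextafter is a crude substitute for the directed rounding that an interval library such as Profil/Bias provides.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>

// Hypothetical minimal interval type.
struct Interval { double lo, hi; };

// Encloses a / b for an interval b that does not contain zero.
Interval divide(Interval a, Interval b) {
    if (b.lo <= 0.0 && b.hi >= 0.0)
        throw std::domain_error("divisor interval contains zero");
    const double q[] = {a.lo / b.lo, a.lo / b.hi, a.hi / b.lo, a.hi / b.hi};
    const double inf = std::numeric_limits<double>::infinity();
    // Each quotient is rounded to nearest, so stepping one floating-point
    // number outward on both sides covers the rounding error.
    return {std::nextafter(*std::min_element(q, q + 4), -inf),
            std::nextafter(*std::max_element(q, q + 4), inf)};
}
```

With a force of [-10, -8] and a mass of [2, 4], divide returns an interval slightly wider than [-5, -2], which encloses every possible acceleration. For matrices no such shortcut exists, which is why a validated linear systems solver is needed in the triple pendulum case.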

The arms of the triple pendulum start to move from their initial angles 
$\beta_1 = 30^\circ$, $\beta_2 = \beta_3 = 40^\circ$ with zero velocities. The 
system is considered over the time interval [0; 8]. Again, the results shown in 
Fig. 11 demonstrate the overall similarity of the solution trajectories obtained 
with the QR-factorization interval Taylor series algorithm and the AMB method. 




Fig. 11. Comparison of the validated and Adams-Moulton-Bashforth solution for a 
triple pendulum: trajectories (left) and their close-ups (right) 



The diagram on the right compares the position obtained with the help of the 
QR-factorization algorithm using constant and variable stepsize control 
strategies with the AMB solution. Again, to see the differences more clearly, the 
three trajectories are represented in relation to the midpoints of the obtained 
intervals. This time the AMB trajectory lies within the interval boundaries and 
its deviation from the points of reference is indiscernible at the scale of 
$10^{-6}$. The intervals obtained with the variable stepsize control strategy 
have on average larger diameters than those of the constant stepsize strategy, 
but the computing time increases considerably in the latter case. Judging by the 
scale of the diagram, the obtained intervals are not as tight as in the first 
example, which is caused partly by the use of the linear systems solver, partly 
by the validated method itself and the lack of interval optimization in the 
transmission elements. Besides, the system in question is chaotic, that is, 
extremely sensitive to the initial conditions (so that initially nearby points 
can evolve quickly into very different states). It is interesting to point out 
that we obtained better enclosures for non-chaotic systems with point interval 
initial values. The separation of the influences of the wrapping effect and 
chaotic solution behavior on the results and the investigation of the 
interconnections between chaotic systems and intervals may be a promising topic 
for further research. 

4.5 Example: Point-Tracking Manipulator 



As a close-to-life example of verified modeling in MOBILE, consider a 
two-armed manipulator conceived to track a point P with the help of a camera. P 
moves along a straight line C (Fig. 12, left). The manipulator consists, as in 
the example of section 3.3, of two revolute joints and two rigid links 
with masses, where all variable names are retained. The tracking of the motion 
of P is accomplished by measuring the distances $g_z$, $g_y$ of P relative to the 
coordinate planes orthogonal to the z and y axes of the end-effector, respectively, 
which corresponds to a simple camera model. The tracking error is compensated by 
applying a force 

$$Q_g = P\,g + I \int_{\tau=0}^{t} g(\tau)\,d\tau + D\,\dot{g} \qquad (9)$$

in the direction of the corresponding axis on the end-effector, where P, I, D are 
the constants of a corresponding PID controller. The overall differential 
equations involve two second order equations of the form 



$$M(q)\,\ddot{q} + b(q, \dot{q}) = Q(q, \dot{q}, x; t), \qquad (10)$$

where $q = [\beta_1, \beta_2]^T$ and $x$ represents two additional variables 
stemming from the ordinary first order differential equations 

$$\dot{x} = [\dot{x}_1, \dot{x}_2]^T = [g_y, g_z]^T, \qquad (11)$$

by which the integrals in the PID controllers are substituted by the new variables 
$x_j$. Thus, one obtains an overall system of six first order differential equations 
with the initial values for q corresponding to the initial configuration of the 
manipulator and $x_0 = [0, 0]^T$. Assuming that the point P moves according to a 
known function of time, one obtains a non-autonomous system of differential equations. 
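
To see where the count of six comes from, the system can be written out in first order form. This is a sketch in the notation of (10) and (11), assuming that the mass matrix $M(q)$ is invertible; the velocity variable $v$ and the initial values $q_0$, $v_0$ are symbols introduced here for illustration only.

```latex
\begin{aligned}
\dot{q} &= v, \\
\dot{v} &= M(q)^{-1}\bigl(Q(q, v, x; t) - b(q, v)\bigr), \\
\dot{x} &= [g_y, g_z]^T,
\end{aligned}
\qquad q(0) = q_0,\quad v(0) = v_0,\quad x(0) = [0, 0]^T .
```

Each of $q$, $v$, and $x$ has two components, which gives six equations, and the explicit dependence of $Q$ on $t$ is what makes the system non-autonomous.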

To model the “PID behavior” of this system, the new force transmission 
element MoPIDForce is required as well as its verified version. Two elements of 
this type are used to track the point and a MoHarmonicVibration to move it. 

Fig. 12 (right) shows the position and velocity of the camera over the time 
interval [0; 17], obtained with the QR factorization (the order of the Taylor 
expansion is 24, tolerances are set to $10^{-8}$) and AMB algorithms. Once again 
we observe the overall similarity of the respective trajectories as well as the 
tightness of the enclosures. Regrettably, the computing time amounts to ten 
hours, which is nonetheless faster than for the triple pendulum, because the 
system of linear equations to be solved is smaller. As already mentioned, we 
continue working on this problem. 

It is interesting to point out that we had to model the dynamics of a 
non-autonomous system here, whereas the assumption for IIVPS (section 4.2) 
was autonomy. Having no expressions for the resulting differential equations, we 
were not able to transform the original system into an autonomous one. But 
using algorithmic differentiation, it was possible to solve the problem not only 
in this particular case, but also in general. 

Fig. 12. The point-tracking manipulator (left) and the position and velocity of its 
camera obtained with validated and Adams-Moulton-Bashforth algorithms (right) 
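
For reference, the textbook way of turning a non-autonomous system into an autonomous one is to append the time as an extra state with derivative one. The sketch below shows this on the level of function wrappers, which requires no symbolic expression for the right hand side. It is not necessarily how the MOBILE extension treats the problem; the type aliases and makeAutonomous are hypothetical, and plain double values are used instead of intervals.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

using State = std::vector<double>;
using NonAutonomousRhs = std::function<State(double, const State&)>;
using AutonomousRhs = std::function<State(const State&)>;

// Turns y' = f(t, y) into z' = g(z) with z = (y, t) and t' = 1.
// Precondition: z is not empty, its last component holds the time.
AutonomousRhs makeAutonomous(NonAutonomousRhs f) {
    return [f](const State& z) -> State {
        const double t = z.back();          // last component is the time
        State y(z.begin(), z.end() - 1);    // remaining components are y
        State dz = f(t, y);
        dz.push_back(1.0);                  // dt/dt = 1
        return dz;
    };
}
```

A right hand side such as f(t, y) = -y + sin(t) then becomes a function of the augmented state (y, t) alone.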



Another interesting question connected with the example would be modeling 
the dynamics of the manipulator with MyIntervalSlacknessJoint instead of 
MoElementaryJoint, as shown in section 3.5 for its kinematics. The implementation 
in this case presupposes the enhancement of the methods for reducing the 
wrapping effect, described in section 3.5, with algorithmic differentiation and 
their integration into the IIVPS system of MOBILE. The current task for the 
authors is to allow for this modeling possibility, that is, to bring the software 
from sections 3 and 4.4 together. 

Modeling with MyIntervalSlacknessJoint would provide for computations 
with some uncertainty in the parameters, which would help to calculate, for 
example, their tolerances: how far the axes of the arms are allowed 
to deviate from being concentric without influencing the overall system behavior 
too much, or how an inaccuracy in the length of the arms affects the tracking 
of the point. So far, we have been able to validate the numerical results and show 
their correctness. 



5 Conclusion: Prospects and Achievements 

We have shown how interval techniques and modeling software can be combined 
to the advantage of both: the latter acquires the opportunities of validated 
modeling and uncertainty treatment, the former a real-life application. But we 
have also pointed out the areas where “naive methods” of such an integration 
produce unsatisfactory results and new algorithms, based on a better 
understanding of MOBILE's inner structures and principles, have to be developed. 
One such area is the improved treatment of interval transmission of velocity, 
acceleration, and force (similar to that presented in section 3.5) aiming at 
reducing the wrapping effect caused by rotations, which is at the final stage of 
implementation now. Another is the attempt to change MOBILE's algorithm for 
building systems of differential equations in state-space form using additional 
information provided by knowledge of derivatives, which will reduce computing 
time for the simulation of dynamics. For this simulation level, implementing 
further verification algorithms, optimizing the data structures, and thoroughly 
testing the solver are the next steps. 

Kinematics and dynamics of mechanical systems can now be modeled with 
interval methods. To achieve this, basic mathematical objects, kinetostatic 
state objects, and kinetostatic transmission elements as well as the objects for 
pre-modeling of dynamics from MOBILE were transformed to provide interval 
calculus along with automatic calculation of Taylor coefficients for the system 
itself and for the corresponding variational equation. As a result, the 
respective extensions of MOBILE were implemented. 

The connection of MOBILE and interval arithmetic allows for easier integration 
of reliable algorithms based on intervals. For example, methods for verified 
distance calculation and an accurate fault tree algorithm for calculating the 
failure distribution of a mechanical system from the failure distributions of its 
key subsystems will be adapted to MOBILE in the future. 

Abstract. In this paper we discuss reliable methods in the field of finite 
precision geometry. We begin with a brief survey of geometric computing 
and approaches generally used in dealing with accuracy and robustness 
problems in finite precision geometry. Moreover, two reliable geometric 
algorithms based on these approaches are presented. The first one is 
a new distance algorithm for objects modeled in a common octree. The 
results are exact and include good bounds on all subdivision levels. Using 
smoother enclosures on the highest level, a link is provided to well-known 
algorithms for convex and non-convex objects. 

We discuss the general concept and advantages of special bounding volumes 
with representations directly connected to the representation of 
the enclosed object: implicit and parametric Linear Interval Estimations, 
(I)LIEs, are, roughly speaking, just thick planes enclosing the object. They 
are constructed using Taylor models or affine arithmetic. The particular 
structure of (I)LIEs allows the construction of effective hierarchies of 
bounding volumes and the development of effective intersection tests for 
the enclosed object with rays, boxes and other LIEs. In addition, a fast 
reliable intersection test for two LIEs is presented in detail. 



1 Introduction 

Geometric algorithms are widely used in robotics, computer graphics, computer 
aided design or any simulations of a virtual environment. Common representa- 
tions for objects are constructive solid geometry models (CSG-models), bound- 
ary representation models (B-Rep-models) or tessellations (e.g. octrees). Single 
surfaces or surface patches are mostly represented in parametric or implicit form, 
or as subdivision surfaces. The choice of the appropriate representation 
depends on the application. 

Fig. 1. Applications of distance computation. 



Because exact modeling of an object is very time consuming and can be 
carried out only in certain special cases, polyhedral structures are recommended 
for path planning in robotics. Octrees are often used for scene reconstruction 
from sensor data. Parametric surfaces are an important tool for objects which are 
located near the robot. In the field of contact analysis and path planning, efficient 
distance and intersection algorithms play a decisive role in most simulations. 

Distance algorithms are most frequently used in robotics (see Figure 1) and 
also in computer games not only to determine the distance between two obsta- 
cles in the environment of a robot or between a sensor point and an object, 
but also to obtain the results of difficult geometric comparisons without actually 
doing them. If we know that two surfaces are too far apart to intersect, we do 
not need the more expensive intersection calculations. Here bounding volumes 
are a common technique, which relies on a hierarchical model representation of 
the two surfaces using axis-aligned bounding boxes (AABBs), oriented bounding 
boxes (OBBs), parallelepipeds, discrete-orientation polytopes (DOPs), spheres, 
or new concepts of parameterized bounding volumes such as Linear Interval 
Estimations (LIEs) [7] or Implicit Linear Interval Estimations (ILIEs) [8]. Hier- 
archies of bounding volumes provide a fast way to perform collision detection 
even between complex models. The determination of the offset to a surface is 
another example of a problem which can be formulated in terms of distance 
computation. Hierarchical algorithms are also applied in computer graphics to 
perform point- or box-surface incidence tests and ray-surface or surface-surface 
intersections. Here, it is of interest not only to test whether an intersection exists, 
but also to compute the (exact) intersection set. Some applications for such al- 
gorithms are, for instance, the rendering of implicit and parametric surfaces, the 
voxelization of implicit objects, the computation of surface-surface intersections, 
and visibility computations. 

Fig. 2. Accuracy and robustness problems. 



The methods mentioned here represent only a small selection of the geometric 
algorithms and structures commonly applied in the field of object modeling, 
contact analysis and path planning. 

Usually, they are sophisticated algorithms designed and proven to be correct 
for objects defined over the domain of real numbers which can only be approxi- 
mated on the computer. Due to rounding errors many implementations of geo- 
metric algorithms simply compute the wrong results for input values for which 
they are supposed to work. Numerical non-robustness in scientific computing is 
a well-known and widespread phenomenon. The implementation of an algorithm 
is in general considered robust if its output is always the correct response to 
some perturbation of the input, and stable if the perturbation is small. 

Although non-robustness is already an issue in a purely numerical computa- 
tion, it is more intractable in a geometric one. To appreciate why the robustness 
problem is especially hard for geometric computation, we need to understand 
what makes a computation geometric. Geometric computation involves not only 
numerical computations but also combinatorial structures as well as certain non- 
trivial consistency conditions between the numerical and combinatorial data. 
Consequently, in purely numerical computations a result becomes unusable when 
there is a severe loss of precision. In geometric computations errors become
serious when the computed result leads to inconsistent states of the program or is
qualitatively different from the true result, e.g. when the combinatorial structure
is wrong.
Accordingly, a loss of robustness related to geometric algorithms must always be 
understood in both its numerical and its topological meanings (see Figure 2). 

Researchers trying to create robust geometric software use one of two
approaches. The first is some form of exact computation in which every numerical
quantity is computed exactly (explicitly, if possible); it relies on big-number
packages and uses filters to make this approach viable. Alternatively, they
can continue to use floating-point or some other finite precision arithmetic, and
try to make their computation robust.

Although exact computation is a safe method of achieving robustness, it 
is somewhat inefficient for most robotic applications. Exact geometric compu- 
tation requires that every evaluation is correct, which can be achieved either 
by computing every numeric value exactly (e.g. using exact integer or rational 
arithmetic) or by employing some implicit or symbolic representation that allows 
values to be computed exactly. But an exact computation is only possible when
all numeric values are algebraic or when the result of the geometric algorithm
depends only on the signs of some quantities (such information
can be obtained with adaptive methods). Furthermore, the cost of an arithmetical
operation is no longer constant, as in the case of floating-point arithmetic,
but depends upon its context and increases due to geometric constructions in 
which a new geometric structure is produced from an old one. Because of this 
perceived performance cost, the exact geometric computation does not appear 
to be widely used in robotics. Besides, in most robotic applications the input 
data are arbitrary real numbers (e.g. sensor data) which have to be cleaned up 
into exact values (e.g. an inexact input point can be viewed as the center of a 
small ball) before being fed to the exact algorithm. 

On the other hand, the common alternative to exact computation, finite preci- 
sion geometry, is faster, readily available, and widely used in practice; however 
exactness and robustness are no longer guaranteed. Here, correct and verifiable 
geometric reasoning using finite precision arithmetic is demanded. 

This paper aims to present new methods for the design of accurate and robust 
finite precision geometric algorithms which yield reliable results despite rounding 
errors caused by the limited precision of the computation. It begins with a short 
overview of the most common reliable techniques in the field of finite precision
geometry: interval arithmetic or affine arithmetic, approaches which reduce the 
effect of overestimation caused by interval evaluations, Taylor models, and the 
exact scalar product. 

Section 3 proposes a new algorithm for distance computation between octrees 
based on the use of the exact scalar product. Another center of interest in this 
section is the development of efficient and accurate algorithms for distance cal- 
culation between a sensor point fixed on a robot and a target or obstacle (or 
obstacles) in a complex environment. An accurate distance algorithm for convex 
and non-convex polyhedra with a priori error bounds of the computed values is
provided. Robust solutions to these geometric problems are used in collision-free
path planning if a given end-effector is moving amid a collection of (un)known 
obstacles from an initial to a desired final position as well as in dealing with the 
resulting contact problems. The advantages of the special structure of (implicit) 
linear interval estimations computed using Taylor models and affine arithmetic 
are demonstrated in Section 4, followed by a detailed discussion of robust inter- 
section and enumeration algorithms for implicit and parametric surfaces based 
on spatial subdivision. Finally, Section 5 summarizes the results.

2 Handling of Robustness Problems

Because there is no general theory on how to deal with them, the handling 
of robustness problems in finite precision geometry takes a number of different 
approaches. In order to avoid inconsistent decisions these fall into two categories. 
The first places higher priority on topological and combinatorial data, while the 
second emphasizes numerical data. 

The topology-oriented approach leads to robust algorithms which never crash 
and compute output with essential combinatorial properties, but the computed 
numerical values do not necessarily correspond to the real solution of the geo- 
metric problem being addressed. Typically a topology-oriented algorithm does 
not treat sign computations producing sign zero. In those cases where the nu- 
merical value of a sign computation is zero, it will be replaced by a positive or 
negative value, whichever is consistent with the current topology. For this reason 
the topology-oriented approach is not suitable for certain computations, such as 
determining the real distance points between two objects. 

In such cases numerical approaches are more appropriate. Their typical strate- 
gies are based on an association of tolerances to geometric objects in order to 
represent uncertainties. The representation of a value by an approximation and 
an error bound or an interval is a numerical analogue of these strategies. In this 
context the term interval geometry can also be found [33] . 

2.1 Interval Arithmetic 

Approximation and error bounds define an interval that contains an exact value. 
In interval arithmetic the real numbers are stored as intervals with floating-point 
endpoints. Computations on the numbers are performed as sets of computations 
on the interval bounds, e.g. $[a, b] + [c, d] = [a + c, b + d]$. Interval arithmetic
is the most common technique providing reliable solutions for many numerical
problems. Unfortunately, overestimation resulting from standard interval
evaluations is an often criticized drawback of interval arithmetic. See Alefeld and
Herzberger [1] for further reading.
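The following minimal sketch illustrates interval addition and subtraction on floating-point endpoints. It is not taken from the paper: the function names are hypothetical, and outward rounding is imitated by stepping each endpoint one floating-point number outward with `math.nextafter` (Python 3.9 or later) instead of switching the hardware rounding mode. The last line shows the overestimation mentioned above: subtracting an interval from itself does not give the point interval [0, 0].

```python
import math

def interval_add(x, y):
    """[a, b] + [c, d] = [a + c, b + d], widened by one float on each side."""
    lo = x[0] + y[0]
    hi = x[1] + y[1]
    # Round-to-nearest may land inside the exact result, so step outward
    # to keep an enclosure (an assumption standing in for directed rounding).
    return (math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

def interval_sub(x, y):
    """[a, b] - [c, d] = [a - d, b - c], widened by one float on each side."""
    lo = x[0] - y[1]
    hi = x[1] - y[0]
    return (math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

x = (1.0, 2.0)
s = interval_add(x, (0.1, 0.2))   # encloses [1.1, 2.2]
d = interval_sub(x, x)            # encloses [-1, 1], not [0, 0]
```

The width of `d` comes from treating the two operands as independent, which is the dependency problem that motivates the refinements in the following subsections.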

2.2 Epsilon Geometry 

Another method closely related to interval arithmetic is epsilon geometry, which 
was defined by Guibas, Salesin and Stolfi [21] and uses an epsilon predicate 
instead of a Boolean value to obtain information on how much the input satisfies 
the predicate. An epsilon predicate returns an interval that identifies a region 
over which the predicate is definitely true, definitely false or simply uncertain. So 
far, epsilon geometry has been applied only to a few basic geometric predicates. 
Moreover, it is not clear how to handle the regions of uncertainty. 

2.3 Affine Arithmetic 

Affine arithmetic, first proposed by Comba and Stolfi [11], is an extension to 
interval arithmetic which reduces the effect of overestimation by taking into
account the dependencies of the uncertainty factors of input data, approximation
and rounding errors. In this way, error expansion can often be avoided and tighter 
bounds on the computed quantities achieved. 

When using this approach, each numerical quantity is stored as an affine form

$$\hat{x} = x_0 + x_1 \varepsilon_1 + x_2 \varepsilon_2 + \dots + x_n \varepsilon_n, \tag{1}$$

where $\varepsilon_i \in [-1, 1]$ denotes a noise symbol representing one source of error or
uncertainty. $x_0$ is the central value of the affine form and the $x_i$ are partial
deviations. For each new source of error a new noise symbol $\varepsilon_k$ is introduced and
added to the affine form.

Each interval can be expressed as an affine form, but an affine form can only 
be approximated by an interval as it carries much more information. An interval 
describes only the general uncertainty of the data, whereas affine arithmetic 
splits this uncertainty into specific parts. Thus, a conversion from affine forms 
to intervals in most cases implies a loss of information. 

Let $\hat{x} := x_0 + x_1 \varepsilon_1 + x_2 \varepsilon_2 + \dots + x_n \varepsilon_n$ be the affine form of the fuzzy quantity
$x$. Then $x$ lies in the interval

$$[x] := [x_0 - \xi,\ x_0 + \xi], \qquad \xi := \sum_{i=1}^{n} |x_i| .$$

$[x]$ is the smallest interval enclosing all possible values of $\hat{x}$.

Let $X = [a, b]$ be an interval representing the value $x$. Then $x$ can be
represented as the affine form

$$\hat{x} = x_0 + x_k \varepsilon_k$$

with $x_0 := (b + a)/2$; $x_k := (b - a)/2$.

Affine arithmetic is slower than standard interval arithmetic, but in cases 
where there might be error correlation from one computation step to the next, 
this approach is beneficial. 
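As an illustration of the two conversions and of the benefit of tracking dependencies, here is a small sketch. The class `Affine` and its methods are hypothetical names introduced for this example, only subtraction is implemented, and floating-point rounding errors are ignored (a full implementation would add a fresh noise symbol for each rounding error).

```python
from itertools import count

_fresh = count(1)  # source of new noise-symbol indices (illustrative)

class Affine:
    """Affine form x0 + sum_i x_i * eps_i with eps_i in [-1, 1]."""

    def __init__(self, x0, terms=None):
        self.x0 = x0
        self.terms = dict(terms or {})   # noise index -> partial deviation

    @classmethod
    def from_interval(cls, a, b):
        # x0 = (b + a) / 2, x_k = (b - a) / 2 with a new noise symbol eps_k
        return cls((b + a) / 2, {next(_fresh): (b - a) / 2})

    def __sub__(self, other):
        keys = set(self.terms) | set(other.terms)
        terms = {k: self.terms.get(k, 0.0) - other.terms.get(k, 0.0) for k in keys}
        return Affine(self.x0 - other.x0, terms)

    def to_interval(self):
        # [x0 - xi, x0 + xi] with xi = sum of |x_i|
        xi = sum(abs(c) for c in self.terms.values())
        return (self.x0 - xi, self.x0 + xi)

x = Affine.from_interval(1.0, 3.0)
y = Affine.from_interval(1.0, 3.0)      # same range, independent noise symbol
same = (x - x).to_interval()            # shared noise symbol cancels
independent = (x - y).to_interval()     # no cancellation possible
```

Here `same` collapses to a point because both operands share one noise symbol, whereas `independent` has the width that plain interval subtraction would give.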

2.4 Arithmetical Approaches 

Certain approaches might be described as being based primarily on arithmetical 
- as opposed to geometric - considerations. A highly precise evaluation of arith- 
metical expressions provides a solid tool for the solution of various geometric 
problems. The idea of arithmetical approaches is to isolate the basic operations 
(primitives) which have to be handled in a numerically correct way, where the 
manner in which the respective operands are represented is crucial. The primi- 
tives have to be implemented in such a way that they yield a result which is as 
close as it can be to the best possible machine representation. The computational 
depth of geometric algorithms has to be kept low to control the propagation of 
round-off errors. 

Since scalar products occur frequently and are important basic operations in 
many geometric computations, it is advantageous to perform the scalar product 
calculation with the same precision as the basic arithmetical operations. Using
the exact scalar product delays the onset of qualitative errors and improves
the robustness of the implementation. Other arithmetical approaches, like the 
permutation of operations combined with random rounding (up and down), can 
also be used [33]. 
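The effect of an exact scalar product can be sketched as follows. This is not the hardware-supported long accumulator used in verified computing libraries; as an assumption for illustration, exact rational arithmetic from Python's `fractions` module accumulates the products without error, and a single rounding to the nearest floating-point number is performed at the end.

```python
from fractions import Fraction

def exact_dot(u, v):
    """Scalar product accumulated exactly, rounded once at the end."""
    acc = Fraction(0)
    for a, b in zip(u, v):
        acc += Fraction(a) * Fraction(b)   # floats convert to rationals exactly
    return float(acc)                      # one rounding to nearest

def naive_dot(u, v):
    """Ordinary floating-point accumulation, rounding after every step."""
    s = 0.0
    for a, b in zip(u, v):
        s += a * b
    return s

u = [1e16, 1.0, -1e16]
v = [1.0, 1.0, 1.0]
exact = exact_dot(u, v)   # the true value 1.0
naive = naive_dot(u, v)   # the small term is absorbed and cancelled away
```

In a geometric predicate the sign of such a scalar product decides a combinatorial question, so the naive result would produce a qualitative error rather than a small numerical one.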



2.5 Taylor Models 

The idea of this approach is the representation of a (multivariate) function as a 
Taylor polynomial plus an interval that encloses the range of the remainder: the 
Taylor model of the function. 

Definition 1. Let $f \in C^{n+1}(D)$, $D \subset \mathbb{R}^m$, and let $B \in \mathbb{IR}^m$ be an interval box with
$B \subset D$. Let $T$ be the Taylor polynomial of order $n$ of $f$ around the point $x_0 \in B$.

- An interval $I$ with $\forall x \in B : f(x) - T(x) \in I$ is called an n-th order
  Remainder Bound of $f$ on $B$.

- A pair $(T, I)$ is called an n-th order Taylor model of $f$.

- The set of all remainder bounds is called the Remainder Family; the optimal
  enclosure of the remainder is called the Optimal Remainder Bound.

Thus, a Taylor model encloses the approximated function on the interval box $B$
by a polynomial of n-th order together with an interval for the remainder.

Berz and Hofstätter [5] define an arithmetic for Taylor models based on uni-
and bivariate arithmetical operators and basic functions. It turns out that these
methods are similar to interval arithmetic for the case n = 0.

Taylor models have a remarkable feature with respect to the quality of the
approximation and its convergence: If $B$ decreases, $I$ will decrease in size as the
(n+1)-st power of the size of the box $B$.
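A one-dimensional sketch may help to make the definition concrete. It builds an n-th order Taylor model of the exponential function around 0 on the box [-r, r], using the Lagrange form of the remainder as the remainder bound. The function names are hypothetical, and plain floating-point arithmetic is used, so the bound is illustrative and not verified (a rigorous version would round the bound outward).

```python
import math

def taylor_model_exp(n, r):
    """n-th order Taylor model (T, I) of exp around 0 on the box B = [-r, r]."""
    coeffs = [1.0 / math.factorial(k) for k in range(n + 1)]
    # Lagrange remainder: |exp(x) - T(x)| <= e^r * r^(n+1) / (n+1)! for |x| <= r
    bound = math.exp(r) * r ** (n + 1) / math.factorial(n + 1)
    return coeffs, (-bound, bound)

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial part T."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

n = 3
T, I = taylor_model_exp(n, 0.5)
T_half, I_half = taylor_model_exp(n, 0.25)   # same order, box of half the size
shrink = I[1] / I_half[1]                    # at least 2**(n + 1)
```

Halving the box reduces the remainder bound by a factor of at least 2^(n+1), which is the convergence behaviour stated above.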



3 Accurate Distance Algorithms 

Obstacles are often modeled or reconstructed from sonar and visual data leading 
to uncertain information. Descriptions based on polyhedral or hierarchical octree 
structures lead to a considerable reduction of data, which makes effective storing 
and processing possible. First, we will deal with objects represented by an octree 
in three dimensions and then with a more general n-tree in higher dimensions. 

Octrees are very suitable for building environments where obstacles must be 
taken into account when considering collision-free path planning as they enable 
the location of free and occupied regions based on accurate distance calculations. 

In Figure 3 a non-convex object is represented by an axis-aligned, level-three 
octree. The round nodes are gray because they have white and black leaves. 
Since the octree is constructed through the subsequent division of boxes, all 
constructed nodes are boxes whose boundary representations can be computed 
using an appropriate fixed-point arithmetic.

Fig. 3. Octree representing a non-convex object.



3.1 An Accurate Distance Algorithm for Octrees 

The distance calculation between two objects represented by a common octree 
which has depth N and extra color information in gray nodes is based on a 
simple computation of the distance between two boxes. 



Fig. 4. Various examples of positioning two boxes (panels: distance surface/surface, intersection, distance edge/edge, distance vertex/vertex).



First, we establish a procedure $\mathrm{dist}^2(Q_1, Q_2)$ for the rectilinear axis-aligned
boxes $Q_1, Q_2$ described by a vertex point with the smallest coordinates and
the length of three edges:

$$Q_1 : [X_1, X_2, X_3, h_1, h_2, h_3] = I_1 \times I_2 \times I_3$$

$$Q_2 : [Y_1, Y_2, Y_3, k_1, k_2, k_3] = J_1 \times J_2 \times J_3$$

We introduce a case-selector determining where the first box lies with respect
to the other (outside below or above, cutting):

$$c_n := \begin{cases} (Y_n - X_n - h_n)^2, & Y_n > X_n + h_n \\ (X_n - Y_n - k_n)^2, & X_n > Y_n + k_n \\ 0, & \text{otherwise} \end{cases} \qquad n = 1, 2, 3.$$

The following cases appear (including also the other cases surface to vertex etc.):

- Intersection:

  $I_1 \cap J_1 \neq \emptyset \wedge I_2 \cap J_2 \neq \emptyset \wedge I_3 \cap J_3 \neq \emptyset \Rightarrow \mathrm{dist}^2 = 0$

- Surface to surface (the distance vector may move on opposite facets; l, m, n
  pairwise distinct):

  $I_l \cap J_l \neq \emptyset \wedge I_m \cap J_m \neq \emptyset \wedge I_n \cap J_n = \emptyset \Rightarrow \mathrm{dist}^2 = c_n$

- Edge to edge (the distance vector may move on opposite edges):

  $I_l \cap J_l \neq \emptyset \wedge I_m \cap J_m = \emptyset \wedge I_n \cap J_n = \emptyset \Rightarrow \mathrm{dist}^2 = c_n + c_m$

- Vertex to vertex:

  $I_1 \cap J_1 = \emptyset \wedge I_2 \cap J_2 = \emptyset \wedge I_3 \cap J_3 = \emptyset \Rightarrow \mathrm{dist}^2 = c_1 + c_2 + c_3$

If the entries $X_1, X_2, X_3, X_1 + h_1, X_2 + h_2, X_3 + h_3, Y_1, Y_2, Y_3, Y_1 + k_1, Y_2 + k_2,
Y_3 + k_3$ are machine numbers, the square of the distance can be calculated up
to 1 ulp with the aid of the exact scalar product. If a fixed point arithmetic is
used, the results are exact.
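The case-selector can be sketched in a few lines. Since an empty intersection of the edge intervals on an axis is exactly the situation in which the selector is non-zero, summing the selectors over all axes covers the four cases at once. The function names are hypothetical, and exact rational numbers stand in for the fixed-point arithmetic mentioned above.

```python
from fractions import Fraction

def case_selector(x, h, y, k):
    """Squared gap between the intervals [x, x + h] and [y, y + k] on one axis."""
    if y > x + h:        # second box lies above the first
        return (y - x - h) ** 2
    if x > y + k:        # second box lies below the first
        return (x - y - k) ** 2
    return 0             # the two edge intervals intersect

def dist2(q1, q2):
    """Squared distance of boxes given as [X_1..X_d, h_1..h_d]."""
    d = len(q1) // 2
    return sum(case_selector(q1[n], q1[n + d], q2[n], q2[n + d]) for n in range(d))

def box(*values):
    """Illustrative helper: store all entries as exact rationals."""
    return [Fraction(v) for v in values]

unit = box(0, 0, 0, 1, 1, 1)
surface = dist2(unit, box(2, 0, 0, 1, 1, 1))               # one empty axis
edge = dist2(unit, box(2, 3, 0, 1, 1, 1))                  # two empty axes
vertex = dist2(unit, box(2, 3, 4, 1, 1, 1))                # three empty axes
overlap = dist2(unit, box("1/2", "1/2", "1/2", 1, 1, 1))   # boxes intersect
```

Because `dist2` reads the dimension from the length of its arguments, the same sketch applies to quadtrees and to higher-dimensional trees.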

We will now assume that the octree represents two objects, a white (w) and a
black (b) one, and that the leaves are entirely white or black depending on
the represented object or red (r) for the free space. We further assume that the
octree has no bw-boxes which would yield $\mathrm{dist}^2 = 0$.

The second part of our algorithm computes the distance between the two objects
using the distance formulae between two cubes from part one:

- Initialize the lists LB, LW, LG, the distance D = 3, and boxes W =
  [0, 0, 0, 0, 0, 0] and B = [1, 1, 1, 0, 0, 0].

  /* The lists LB and LW are void, LG contains the unit cube. LB contains
     actual black boxes, LW contains actual white boxes, LG contains gray boxes
     of the i-th level */

- For all levels i = 0, 1, ..., N /* N depth of the octree */ do

    /* Step 1: Fill lists LW, LB */
    For all children Q of all boxes of size 2^(-i) on level i
        /* Update LG */
        If Q = white then
            { Q -> LW; For all T in LB do
                if (dist^2(Q, T) < D)
                    then { D := dist^2(Q, T); W := Q; B := T }
            }
        else if Q = black then
            { Q -> LB; For all T in LW do
                if (dist^2(Q, T) < D)
                    then { D := dist^2(Q, T); W := T; B := Q }
            }
        /* Two or more different kinds of subboxes */
        else if Q = gray then Q -> LG;

    /* Step 2: Drop all irrelevant boxes; define min(∅) = 0 */
    For all T in LB, T ≠ B do
        For all Q in LG with attribute wr or bwr calculate dist^2(Q, T);
        dist^2_wr := min { dist^2(Q, T) | Q has attribute wr };
        dist^2_bwr := min { dist^2(Q, T) | Q has attribute bwr };
        if dist^2_wr > D and dist^2_bwr > 3 * 2^(-2i-2) then drop T in LB;

    For all T in LW, T ≠ W do
        For all Q in LG with attribute br or bwr calculate dist^2(Q, T);
        dist^2_br := min { dist^2(Q, T) | Q has attribute br };
        dist^2_bwr := min { dist^2(Q, T) | Q has attribute bwr };
        if dist^2_br > D and dist^2_bwr > 3 * 2^(-2i-2) then drop T in LW;

- return D.
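The bookkeeping of Step 1 can be sketched as follows. This is a simplified illustration, not the full algorithm: it processes a flat list of already classified leaves, ignores gray boxes, and omits the pruning of Step 2, so every white box is compared with every black box seen so far. All names are hypothetical, and the box distance is repeated to keep the sketch self-contained.

```python
from fractions import Fraction

def dist2(q1, q2):
    """Squared distance of axis-aligned boxes [X_1..X_d, h_1..h_d]."""
    d = len(q1) // 2
    total = 0
    for n in range(d):
        x, h, y, k = q1[n], q1[n + d], q2[n], q2[n + d]
        if y > x + h:
            total += (y - x - h) ** 2
        elif x > y + k:
            total += (x - y - k) ** 2
    return total

def closest_pair(leaves, d_init=3):
    """Running minimum D with its white box W and black box B (Step 1 only)."""
    LW, LB = [], []
    D, W, B = d_init, None, None   # 3 = squared diagonal of the unit cube
    for color, q in leaves:
        if color == "white":
            LW.append(q)
            for t in LB:
                if dist2(q, t) < D:
                    D, W, B = dist2(q, t), q, t
        elif color == "black":
            LB.append(q)
            for t in LW:
                if dist2(q, t) < D:
                    D, W, B = dist2(q, t), t, q
        # red (free space) leaves are skipped
    return D, W, B

def cube(x, y, z, size="1/4"):
    """Illustrative helper: a cubic leaf with exact rational entries."""
    s = Fraction(size)
    return [Fraction(x), Fraction(y), Fraction(z), s, s, s]

leaves = [
    ("white", cube(0, 0, 0)),
    ("red", cube("1/4", 0, 0)),
    ("black", cube("3/4", 0, 0)),
    ("black", cube("1/2", "1/2", 0)),
]
D, W, B = closest_pair(leaves)
```

Without the pruning step the number of comparisons is the product of the numbers of white and black boxes, which is the bound quoted for the unpruned case in the remarks that follow.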

3.2 Remarks 

This algorithm can be modified to return a list of all solutions. To this end, it is 
necessary to establish a list of pairs of boxes with the same temporary distance. 
The algorithm provides good upper and lower bounds: the temporary distance
D is an upper bound, but we may use $D = 3 \cdot 2^{-2i}$ if there is a bwr-box on level
i. It is also possible to compute lower bounds. To this end, determine the
greatest level i with bwr-boxes. Replace on an arbitrary level j > i all br-boxes
with black boxes and all wr-boxes with white boxes. Then apply the algorithm
to return D as a lower bound.

The algorithm works in any higher dimension when the definition of the case- 
selector is generalized to arbitrary dimensions. 

On level i we find $2^{6(i+1)}/3$ as an upper bound for the number of box
comparisons and distance calculations. Thus, in the worst case, overall complexity is
$O(2^{6(N+1)})$. If we do not drop irrelevant black and white boxes, the complexity
is bounded by the product of the number of black and white boxes.

On the highest level tighter (convex) enclosures of the objects inside the boxes 
can be used to obtain better bounds for D. Then the simple distance computa- 
tions in the first step are replaced by an algorithm for convex objects. 

For an implementation it is not necessary to create the lists LB, LW, LG. All 
work can be done on the underlying data structure by traversing the octree in a 
certain manner and using appropriate flags in the nodes. 



3.3 Examples 

The first example concerns a level-three quadtree. In executing the algorithm the 
white box on the right-hand side is dropped. The result is found in the second 
and third quadrant. By applying a convex hull algorithm on the set of extreme 
vertices we find simple convex enclosing sets. 




Fig. 5. Quadtrees and convex hull of two objects.

Fig. 6. Octree with two objects on level three.



The convex hull of the extreme vertices is shown in Figure 5. The distance 
remains unchanged. In the next example (see Figure 6) the algorithm eliminates 
the boxes near the boundary z = 1 with respect to the coordinate system shown 
in Figure 3. 

3.4 Convex Hulls 

Now let us turn our attention to the objects obtained by representing
three-dimensional convex sets S by octrees, in order to apply distance theorems
for sets of this kind. If the sets are non-convex, they can be split into convex parts. Building
the octree corresponds to a certain kind of rasterization. So the question arises
whether the objects are digital convex. If we replace each box on the highest
level with its center point x we obtain sets of grid points $S_\Delta$. This approach
allows us to apply results from digital convexity (d.c.):

Fig. 7. Parabolic objects - level 5.



Theorem 1 (see [16]). A digital set $S_\Delta \subseteq \mathbb{Z}^d$, where $\mathbb{Z}^d$ is the set of all d-dimensional
vectors whose components have integer values, is digital convex if and only if
for each point $x \in \mathbb{Z}^d \setminus S_\Delta$ there is a hyperplane with normal vector $x'$ and
distance $a$ to the origin such that $x \cdot x' = a$ and $y \cdot x' > a$ for all $y \in S_\Delta$. If
for each boundary point $x$ of $S_\Delta$ there is a hyperplane such that $x \cdot x' = a$ and
$y \cdot x' \ge a$ for all $y \in S_\Delta$ then $S_\Delta$ is digital convex.

This theorem finds its analogous result due to Tietze in the context of continuous 
convexity. 

Unfortunately, Tietze’s theorem, which says that the condition $x \cdot x' = a$ and
$y \cdot x' \ge a$ need only be verified locally when deriving continuous convexity, does
not hold in the digital world. For this reason, a test for digital convexity cannot
be done in a time proportional to the number of neighbors and boundary points,
as was shown by a counterexample given in Eckhardt [16]. However, if the set
$S_\Delta$ is simply connected and all the boundary points fulfill the interior point
condition, i.e., each point $x \in \partial S_\Delta$ has at least two 8-neighbor points belonging
to $S_\Delta$ and these points are all connected in the 4-neighborhood topology, then
the result of Tietze’s theorem holds true.

A simpler way to proceed is to use the concept of extreme vertices of boxes 
on the boundary. A vertex is said to be an extreme vertex if none of the adjacent 
boxes belongs to the object. In the case of a quadtree there are three neighboring 
boxes; for octrees there are seven boxes. The convex hull of all extreme vertices 
is constructed to obtain an enclosure of the object. Obviously, the convex hull 
also contains the original set S. 
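For the quadtree case the construction can be sketched as follows, assuming that the leaves on the highest level are unit cells indexed by integer pairs (i, j). A corner of an occupied cell is taken as extreme when none of the three other cells meeting at that corner is occupied, and the hull is then built with Andrew's monotone chain method, which is one possible choice and not necessarily the convex hull algorithm used by the authors. All names are hypothetical.

```python
def extreme_vertices(cells):
    """Corners of occupied unit cells whose three neighbouring cells are free."""
    cells = set(cells)
    result = set()
    for (i, j) in cells:
        for di in (0, 1):
            for dj in (0, 1):
                p, q = i + di, j + dj
                # the four cells that share the corner (p, q)
                around = {(p - 1, q - 1), (p, q - 1), (p - 1, q), (p, q)}
                if around & cells == {(i, j)}:
                    result.add((p, q))
    return result

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

cells = {(0, 0), (1, 0), (0, 1)}      # an L-shaped, non-convex object
extreme = extreme_vertices(cells)
hull = convex_hull(extreme)
```

The reentrant corner (1, 1) of the L-shape touches three occupied cells and is therefore not extreme, yet it lies inside the hull of the remaining five vertices, so the hull encloses the whole object.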

Then we can apply our distance algorithms for convex sets and obtain lower 
bounds for the distances. This approach also opens the way to dynamic algo- 
rithms for moving objects. It is well known that rotational motions of octrees 
lead to an unwanted wrapping-effect, which can be avoided by using the convex 
hulls of the objects [25].

3.5 Accurate Distance Algorithms for Convex and Non-convex Polyhedra

Generally, distance algorithms focus on objects represented by convex polyhe- 
dra, which are defined as the convex hull of points in three-dimensional space. 
Although these approaches can be applied for convex polytopes (bounded
polyhedra) in three-dimensional space, a wider class of objects is permitted since it
is also possible to treat non-convex shapes conveniently as a union of convex
polytopes.

There are two main classes of distance algorithms for convex polyhedral 
models. In the first class algorithms are based on Voronoi regions, like the Lin- 
Canny (LC) algorithm [24] and its software implementations, such as I-Collide 
[10], V-Clip [27], or SWIFT++ [17]. Another class is the simplex-based Gilbert-
Johnson-Keerthi (GJK) algorithm [19] and its various extensions, including non- 
convex objects [29] and proximity queries with collision detection [4]. 

One drawback of the original LC algorithm is that it does not readily handle 
penetrating polyhedra; a second is its lack of robustness when applied to models 
in degenerate configurations. The GJK-like algorithms are more robust than LC; 
they can also handle penetration cases. Nonetheless, with GJK-like algorithms, 
computations generally require more floating-point operations. The collision
detection library Q-Collide [9], which was spawned from I-Collide, replaces LC with
the GJK algorithm for low-level collision detection. A numerical comparison of
some variants of the GJK and LC algorithms was done in [20].

Although the GJK algorithm is widely used in robotics, there has been no 
verification of the computed results. For this reason, we have implemented an 
interval version of the GJK distance algorithm for tracking the distance between 
convex polyhedra which is adapted to sensor-based input data [15]. 

We are also interested in simple accurate algorithms to calculate the dis- 
tance between two objects, such as points, collections of axis-aligned boxes, 
(non-)convex polyhedra or NURBS-surfaces with interval vertices. Accurate fi- 
nite precision algorithms have been developed based on suitable projections and 
using controlled rounding and the exact scalar product whereby a verified en- 
closure of the solution is ensured [12]. 

For the case in which the end-effector or the sensor is taken to be a single moving
point, an efficient distance algorithm has been developed which does not rely on
convexity and is thus applicable to non-convex polyhedral surfaces [13]. Under
the same assumption the problem has been solved for the more difficult case of 
NURBS-defined solids based on subdivision techniques and using an algorithm 
for the solution of nonlinear polynomial systems proposed by Sherbrooke and 
Patrikalakis [30]. The extension of this algorithm introduces interval arithmetic, 
the interval version of the convex hull algorithm, and a modified Simplex algo- 
rithm. The new solver allows a verification of obtained results [14] using new 
criteria to guarantee the existence of zeros within the calculated inclusions [18]. 

Our algorithm to compute the distance between a point and a non-convex 
polyhedron does not require decomposing the polyhedron into convex parts or 
iteration and yields the result with high accuracy [13]. It is possible to derive
the explicit absolute or relative errors in a real distance point and the distance
value to the (non-)convex polyhedron as well as the computed approximations 
of these values. 

3.6 An Accurate Distance Algorithm between a Point and a (Non-)Convex Polyhedron

Consider a point $y$ outside a non-degenerate polyhedron $P$ bounded by $\partial P :=
\{S_i,\ i = 1, \dots, m;\ v_j,\ j = 1, \dots, n\}$ with m facets and n vertices.

In the following, the vertices belonging to the facet $S_i$ are denoted by $s_{ik}$,
$k = 1, \dots, t_i$, $t_i > 2$, given in counter-clockwise order, and the edges of the
facet $S_i$ by $[s_{ik}, s_{i(k+1)}]$, $k = 1, \dots, t_i$, with $s_{i(t_i+1)} := s_{i1}$.




Fig. 8. A point y and a non-convex polyhedron P.



We are searching for the shortest straight line segment $[y, x]$ between point $y$,
which is any point outside of polyhedron $P$, and this polyhedron with $x \in \partial P$.
At the beginning, before starting the distance algorithm, we calculate the
correctly rounded cross product

$$n_i^* = (s_{i2} - s_{i1}) \times (s_{i3} - s_{i2}) = s_{i2} \times s_{i3} + s_{i1} \times s_{i2} - s_{i1} \times s_{i3}$$

with $x \times y := (x_2 y_3 - x_3 y_2,\ x_3 y_1 - x_1 y_3,\ x_1 y_2 - x_2 y_1)$ for $x = (x_1, x_2, x_3)$,
$y = (y_1, y_2, y_3)$, and a normal vector $n_i = n_i^* / \sqrt{n_i^* \cdot n_i^*}$ for all $i = 1, \dots, m$.
Then $E_i$ denotes the plane described by

$$E_i : x \cdot n_i - s_{i1} \cdot n_i = 0 .$$
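The facet normal and the side of the plane on which the query point lies can be sketched as follows. As an assumption for illustration, exact rational arithmetic replaces the exact scalar product with final rounding, and all function names are hypothetical. Because the normalisation only divides by a positive number, the sign of the signed distance can be read off the unnormalised normal without computing a square root.

```python
import math
from fractions import Fraction

def exact(p):
    """Illustrative helper: convert a point to exact rational coordinates."""
    return tuple(Fraction(c) for c in p)

def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cross(x, y):
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def facet_normal(s1, s2, s3):
    """Unnormalised normal (s2 - s1) x (s3 - s2) of a counter-clockwise facet."""
    return cross(sub(s2, s1), sub(s3, s2))

def side_of_plane(y, s1, n_star):
    """Sign of (y - s1) . n*, which equals the sign of the signed distance."""
    v = dot(sub(y, s1), n_star)
    return (v > 0) - (v < 0)

def signed_distance(y, s1, n_star):
    """Signed distance to the plane, rounded only in the final division."""
    return float(dot(sub(y, s1), n_star)) / math.sqrt(float(dot(n_star, n_star)))

s1, s2, s3 = exact((1, 2, 3)), exact((4, 0, 1)), exact((2, 5, -1))
y = exact((0, 0, 0))
n_star = facet_normal(s1, s2, s3)
side = side_of_plane(y, s1, n_star)
l = signed_distance(y, s1, n_star)
```

In this example the query point lies on the negative side of the facet plane, so the facet would not be considered for a projection.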

For all scalar product computations the algorithm uses the exact scalar product
followed by rounding (to nearest):

A: We calculate the distances between point $y$ and each plane $E_i$

$$l_i = y \cdot n_i - s_{i1} \cdot n_i .$$

We store the sign of $l_i$, $i = 1, \dots, m$, for future use. There is at least one $l_i > 0$;
therefore the set $I := \{i \mid l_i > 0\}$ is not empty, and we can form the set $J$ of
all $j \in \{1, \dots, n\}$ with $\exists\, i \in I : v_j \in S_i$ and the set $K$ of all pairs $(s, r)$,
$s, r \in \{1, \dots, n\}$, for which $[v_s, v_r]$ is an edge of a facet $S_i$ with $i \in I$.

Then, for all $i \in I$, the projections onto $E_i$ can be accurately calculated:

$$x_i = y - l_i \cdot n_i .$$

Next, we have to decide whether $x_i$ is in $S_i$. For that purpose we calculate the
number of intersections of the ray $x_i + t'(m - x_i)$, $t' > 0$, for a suitable $m \in E_i$,
with the edges $[s_{ik}, s_{i(k+1)}]$, $k = 1, \dots, t_i$, avoiding vertices, by solving a system
of two equations with two variables. These equations result from setting the
first derivatives of the function

$$f(t', t'') = \| s_{ik} + t''(s_{i(k+1)} - s_{ik}) - x_i - t'(m - x_i) \|^2$$

in the variables $t''$ and $t'$ to zero. If $x_i$ belongs to the polygonal surface $S_i$, i.e.
if the number of intersections is odd, we remove all edges from $K$ belonging
to $S_i$ and calculate for the remaining $(s, r) \in K$ the scalar products

$$w_{si} := (y - x_i) \cdot (v_s - x_i) \quad \text{and} \quad w_{ri} := (y - x_i) \cdot (v_r - x_i)$$

and, if $w_{si} < 0$ and $w_{ri} < 0$, we redefine $K := K \setminus \{(s, r)\}$. Then we set a
distance-point $x := x_i$ and the distance $d := l_i$ or update them (if there are
points with the same distance, the result of the algorithm will be a list of
them), and stop the algorithm if $K = \emptyset$.

B: If $K \neq \emptyset$ after step A, then we have to decide for all edges with $(s, r) \in K$
whether the projection of $y$ to the line

$$v(t) := v_s + t\,(v_r - v_s)$$

meets a point with parameter $0 \le t \le 1$. To do so, we form the accurately
calculated scalar products

$$\kappa := (y - v_s) \cdot (v_r - v_s) \quad \text{and} \quad \mu := (v_r - v_s) \cdot (v_r - v_s).$$

If the first of these scalar products is negative or greater than the second, then
the projection ray does not meet the section between $v_s$ and $v_r$. Otherwise, with $t$
equal to the quotient of the first scalar product and the second, the projection point
on $[v_s, v_r]$ is given by

$$x_{sr} := v_s + t\,(v_r - v_s)$$

and the square of the distance by

$$d_{sr}^2 = \frac{((v_r - v_s) \times (y - v_r)) \cdot ((v_r - v_s) \times (y - v_r))}{v_r \cdot v_r - v_r \cdot v_s - v_s \cdot v_r + v_s \cdot v_s} .$$

We replace $J$ by $J \setminus \{s, r\}$. Using the projection point $x_{sr}$ we calculate for all
$j \in J$ the scalar products

$$w_{jsr} := (y - x_{sr}) \cdot (v_j - x_{sr})$$

and if $w_{jsr} < 0$ we set $J := J \setminus \{j\}$.

If the projection point is the nearest distance point, we update $x$ with $x_{sr}$
and $d$ with $d_{sr}$. We stop the algorithm if $J = \emptyset$.

C: If $J \neq \emptyset$ after step B, we compare the distance values of the paths joining
point $y$ and each vertex-point $v_j$, $j \in J$, with the distance found so far and
update $d$ and $x$ if necessary.

The accurate distance algorithm works in linear time $O(Cn)$ with an order
constant $C$ depending on the number of successful projections onto facets and
edges. Furthermore, it can be used to determine the local distance between a 
point and any polyhedral surface described by its vertices and oriented facets. 
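The projection onto an edge in step B can be sketched as follows. The two scalar products of that step are called `kappa` and `mu` here, the function names are hypothetical, and exact rational arithmetic stands in for the exact scalar product with final rounding. A non-degenerate edge is assumed, so `mu` is positive.

```python
from fractions import Fraction

def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cross(x, y):
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def project_onto_edge(y, vs, vr):
    """Foot point of y on the edge [vs, vr] and its squared distance, or None."""
    y, vs, vr = (tuple(Fraction(c) for c in p) for p in (y, vs, vr))
    e = sub(vr, vs)
    kappa = dot(sub(y, vs), e)
    mu = dot(e, e)                     # assumed > 0 (non-degenerate edge)
    if kappa < 0 or kappa > mu:
        return None                    # foot point lies outside the edge
    t = kappa / mu                     # exact parameter in [0, 1]
    x_sr = tuple(a + t * b for a, b in zip(vs, e))
    c = cross(e, sub(y, vr))
    d2 = dot(c, c) / mu                # squared distance via the cross product
    return x_sr, d2

hit = project_onto_edge((1, 1, 0), (0, 0, 0), (2, 0, 0))
miss = project_onto_edge((3, 1, 0), (0, 0, 0), (2, 0, 0))
```

When the projection misses the edge, as for `miss`, the nearest point of that edge is one of its end points, which is the situation handled by the vertex comparison in step C.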

3.7 Error Discussion 

Let $\varepsilon_1 < 2^{-52}$ be the rounding error in the floating-point number space $S :=
(B, l, em, eM)$ characterized by its base $B$, mantissa length $l$ and $[em, eM]$, the
smallest and largest allowable exponents. Then, for the error estimation of a
calculated distance point $x = (x_1, x_2, x_3)$ and the distance value $d = \|y - x\|$ in
the cases discussed in steps A, B and C it can be shown [12] that the results in
Table 1 are valid.



Table 1. Absolute or relative errors in the distance point and value

Step  Error estimations

A:    $x_\nu = X_\nu + \delta_{1,\nu}\,(11.032\,\|y\| + 10.032\,\sigma_i)\,\varepsilon_1$
      $d = D + 4.27\,\delta_1\,(\|y\| + \sigma_i)\,\varepsilon_1$   (a point to a surface)

B:    $x_\nu = X_\nu + \delta_{2,\nu}\,(2.505\,\|y\| + 14.515\,\sigma_i)\,\varepsilon_1$
      $d = D\,(1 + 3.003\,\delta_2\,\varepsilon_1)$   (a point to an edge)

C:    $x_\nu = X_\nu$, $d = D\,(1 + 1.76\,\delta_3\,\varepsilon_1)$   (a point to a point)

$\nu = 1, 2, 3$, $\sigma_i := \max_k \|s_{ik}\|$, $|\delta_{j,\nu}| < 1$, $|\delta_j| < 1$, $j = 1, 2, 3$,
$D = \|y - X\|$, $X \in \partial P$ the real distance point



3.8 Example 

The algorithm was implemented in C++ using the library Profil/BIAS [23]. 
Figure 9 shows the ASCII input file of a non-convex polyhedron. The input file
consists of two parts: the fourteen vertex points of the polyhedron in a Cartesian
coordinate system as geometric information and their positions on its nine faces 
as topological information. The corresponding program layout for the point y 
lying outside of the polyhedron in the origin of the Cartesian coordinate system 
is shown on the opposite side of the figure.

Distance between point and polyhedron (found in step one):
distance: 1.0
point y: (0.0, 0.0, 0.0)
distance points: (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)

Fig. 9. Distance computation: the input file and program layout.



4 Reliable Intersection Algorithms 

In the previous section accurate distance algorithms widely used for path plan- 
ning in robotics were described. In computer graphics, it is important to know 
not only whether two objects intersect, but also where they intersect. Direct 
ray-tracing of parametric surfaces, rendering and voxelization of implicit curves, 
surfaces and volumes, as well as the computation of intersection curves are com- 
mon tasks. 



4.1 On Bounding Volumes and Subdivision 

If a direct solution to the problem is not possible (which is generally the case),
a divide-and-conquer strategy is a widespread approach. A common technique for
reducing the computational complexity of intersection problems is to subdivide
the complex object into simpler objects and to simplify the shape using bounding
volumes. Divide-and-conquer approaches to object-object intersection problems
find, by definition, all possible intersections, but because the solution is
enclosed piecewise, information on the overall topology of the intersection is
lost. Postprocessing steps such as connectivity determination and sorting are
necessary to restore this information. Solutions for this problem can be found
in the classical literature on computational geometry and, e.g., in [2].
Classical bounding volumes are simple solids, such as axis-aligned
or -oriented bounding boxes, parallelepipeds, polyhedra or spheres. In general 
they are computed using range analysis methods based on sampling, exploiting 
convex hull properties of control points, evaluation of derivatives, or applying 
affine or interval arithmetic. Bounding volumes should be a reliable enclosure of 
the object, which is not the case if sampling techniques are used to construct 
the bounding volume. The direct application of interval or affine arithmetic to 
compute a bounding volume produces reliable bounds, but these bounds overes- 
timate the object because functional dependencies are not taken into account, or 
are lost during conversion from affine forms to intervals. Axis-aligned bounding 
boxes are easy to compute and intersect easily with other axis-aligned bounding 
boxes or rays; thus, they are well-suited for rapidly providing an insight into 
the structure of an environment with obstacles and targets. However, in most 
cases they significantly overestimate curves and surface patches. Therefore,
subdivision-based algorithms need many more steps to reach a given precision
than with the much better fitting parallelepipeds. On the other hand,
an intersection test for two parallelepipeds, for instance, is very complex and 
time-consuming. Furthermore, all classical bounding volumes are solids, i.e. they 
provide information only on the location of the whole object. Yet, especially for 
intersection algorithms for parametric objects, in order to accelerate the compu- 
tation it would be interesting to be able to derive information on the location of 
the intersection of the enclosed objects in parameter space from the intersection 
of two bounding volumes. To summarize, the ideal bounding volume provides 
a tight and reliable enclosure of the object, is easily calculated, and intersects 
easily with other, similar bounding volumes. 
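
The interplay between overestimation and subdivision can be illustrated in one dimension. The sketch below is not the authors' code and does not use the Profil/BIAS API: `Interval`, `stepDown`, `stepUp`, `f` and `encloseBySubdivision` are illustrative names, the function $f(x) = x(1 - x)$ is an arbitrary example, and outward rounding is only emulated by widening each round-to-nearest result by one floating-point step with `std::nextafter`, which is cruder than the directed rounding an interval library would use.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Closed interval [lo, hi]; hypothetical minimal type.
struct Interval {
    double lo, hi;
};

// Widen a round-to-nearest result by one floating-point step.
double stepDown(double v) {
    return std::nextafter(v, -std::numeric_limits<double>::infinity());
}
double stepUp(double v) {
    return std::nextafter(v, std::numeric_limits<double>::infinity());
}

Interval sub(Interval a, Interval b) {
    return {stepDown(a.lo - b.hi), stepUp(a.hi - b.lo)};
}

Interval mul(Interval a, Interval b) {
    const double p[4] = {a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi};
    return {stepDown(*std::min_element(p, p + 4)),
            stepUp(*std::max_element(p, p + 4))};
}

// Natural interval extension of f(x) = x * (1 - x). The variable x occurs
// twice, and the evaluation treats both occurrences as independent.
Interval f(Interval x) {
    return mul(x, sub(Interval{1.0, 1.0}, x));
}

// Enclose the range of f over x by splitting x into n pieces and taking
// the hull of the piecewise enclosures.
Interval encloseBySubdivision(Interval x, int n) {
    assert(n > 0 && x.lo <= x.hi);
    const double width = x.hi - x.lo;
    Interval result{std::numeric_limits<double>::infinity(),
                    -std::numeric_limits<double>::infinity()};
    double left = x.lo;
    for (int i = 1; i <= n; ++i) {
        // Neighbouring pieces share an endpoint, so the pieces cover x.
        const double right =
            (i == n) ? x.hi : std::min(x.hi, x.lo + width * i / n);
        const Interval piece = f(Interval{left, right});
        result.lo = std::min(result.lo, piece.lo);
        result.hi = std::max(result.hi, piece.hi);
        left = right;
    }
    return result;
}
```

Over [0, 1] the true range of $f$ is [0, 0.25]. A single interval evaluation (`n = 1`) returns an enclosure of roughly [0, 1], while eight pieces reduce the upper bound to about 0.3125. Both results are reliable, but the tighter one costs eight evaluations, which is the trade-off between enclosure quality and number of subdivision steps discussed above.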



4.2 Linear Interval Estimations 

To overcome problems connected with classical bounding volumes, another form 
of enclosing objects satisfying the requirements for the ideal bounding volume 
listed above has been introduced for parametric and implicit objects: Linear 
Interval Estimations [7, 8] are defined as the linear approximation of the rep- 
resentation of the enclosed object combined with an interval estimation of the 
approximation error. An LIE is just a thick (hyper)plane that can be understood
as a continuous linear set of axis-parallel bounding boxes. Furthermore, the
representation of an LIE corresponds to the representation of the object. This 
means in the parametric case that the LIE can be parameterized in such a way 
that its parameterization corresponds to the parameterization of the enclosed 
object. Each point of the object is enclosed by an "interval point" (an interval
box) of the LIE with the same parameters. In the case of the intersection of two 
LIEs this construction allows direct conclusions on the location of intersections 




